
WO2023031632A1 - Encoder, decoder and communication system and method for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding, and method of training an encoder neural network and decoder neural network for use in a communication system - Google Patents


Info

Publication number
WO2023031632A1
Authority
WO
WIPO (PCT)
Prior art keywords
encoder
item
interpolation
decoder
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/GB2022/052266
Other languages
French (fr)
Inventor
Deniz GUNDUZ
David Burth KURKA
Tze-Yang TUNG
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ip2ipo Innovations Ltd
Original Assignee
Imperial College Innovations Ltd
Application filed by Imperial College Innovations Ltd filed Critical Imperial College Innovations Ltd
Publication of WO2023031632A1 publication Critical patent/WO2023031632A1/en
Anticipated expiration (legal status: Critical)
Current legal status: Ceased (Critical)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 1/00 Arrangements for detecting or preventing errors in the information received
    • H04L 1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L 1/0041 Arrangements at the transmitter end
    • H04L 1/0042 Encoding specially adapted to other signal generation operation, e.g. in order to reduce transmit distortions, jitter, or to improve signal shape
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/0455 Auto-encoder networks; Encoder-decoder networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/092 Reinforcement learning
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M 13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M 13/03 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
    • H03M 13/05 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
    • H03M 13/11 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits using multiple parity bits
    • H03M 13/1102 Codes on graphs and decoding on graphs, e.g. low-density parity check [LDPC] codes
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M 13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M 13/63 Joint error correction and other techniques
    • H03M 13/6312 Error control coding in combination with data compression
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M 13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M 13/65 Purpose and implementation aspects
    • H03M 13/6597 Implementations using analogue techniques for coding or decoding, e.g. analogue Viterbi decoder
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 1/00 Arrangements for detecting or preventing errors in the information received
    • H04L 1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L 1/0045 Arrangements at the receiver end
    • H04L 1/0047 Decoding adapted to other signal detection operation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 1/00 Arrangements for detecting or preventing errors in the information received
    • H04L 1/0001 Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L 1/0014 Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the source coding

Definitions

  • The present application relates to an encoder, a decoder and a communication system comprising a transmitter and a receiver incorporating the encoder and decoder, for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding.
  • In embodiments, the encoder and decoder are neural networks, and in embodiments the information source is a video and the correlated data items are video frames.
  • An aim of a data communication system is to send data from an information source at a transmitter over a communication channel efficiently and reliably, at as high a rate and with as few errors as achievable in view of the channel noise, so that a faithful representation of the original information source can be recovered at a receiver.
  • Information sources providing sequences of correlated data items which share similarities and encapsulate data redundancy from one item in the sequence to the next can represent a significant data payload for transmission between transmitters and receivers over communications channels.
  • An example is video content: a sequence of video frames containing images that are typically heavily correlated over time as the video develops.
  • For instance, a video of a largely static scene, such as from a security camera, remains largely unchanged from one video frame to the next.
  • Video transmission makes up around 80% of traffic on the Internet by volume, and the data burden on transmitters and receivers to transmit video data and other correlated sequences of data correctly and efficiently over communication channels is high.
  • Most digital communication systems today include a source encoder and separate channel encoder at a transmitter and a source decoder and separate channel decoder at a receiver.
  • the symbols of source data are first digitally compressed into bits by the source encoder.
  • the goal in source coding is to encode the sequence of source symbols into a coded representation of data elements to reduce the redundancy in the original sequence of source symbols.
  • In lossless compression, redundancy must be removed in such a way that the original information source can still be reconstructed exactly from the coded representation, while lossy compression allows a certain amount of degradation in the reconstructed version under some specified distortion measure, for example squared error.
  • H.264/MPEG is an example of a lossy source compression standard widely used in practice. Compressing the information source using a source encoder before transmission means that fewer resources are required for that transmission.
  • the output of the source encoder is then provided to a channel encoder.
  • the goal of the channel encoder is to encode the compressed data representation in a structured way using a suitable Error Correction Code (ECC) by adding redundancy such that even if some of these bits are distorted or lost due to noise over the channel, the receiver can still recover the original sequence of bits reliably.
  • the amount of redundancy that is added depends on the statistical properties of the underlying communication channel and the target Bit Error Rate (BER).
  • the modulator converts the bits into signals that can be transmitted over the communication medium.
  • the transmitted waveform is specified by its In-Phase (I) and Quadrature (Q) components, and a modulator typically has a discrete set of pre-specified I and Q values, called a constellation; each group of coded information bits is mapped to a single point in this constellation.
  • Example modulation schemes include phase shift keying (PSK) and quadrature amplitude modulation (QAM).
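To make the constellation-mapping step concrete, here is a minimal illustrative sketch in Python. The QPSK (4-QAM) alphabet and the Gray bit mapping below are assumptions chosen for the example, not values taken from the disclosure:

```python
import numpy as np

# Illustrative unit-energy QPSK constellation: each pair of bits selects one
# of four I/Q points (Gray-mapped; the mapping is an assumption for example only).
QPSK = {
    (0, 0): (1 + 1j) / np.sqrt(2),
    (0, 1): (-1 + 1j) / np.sqrt(2),
    (1, 1): (-1 - 1j) / np.sqrt(2),
    (1, 0): (1 - 1j) / np.sqrt(2),
}

def modulate(bits):
    """Map a flat bit sequence to complex baseband symbols (I + jQ)."""
    assert len(bits) % 2 == 0, "QPSK consumes bits two at a time"
    return np.array([QPSK[(bits[i], bits[i + 1])]
                     for i in range(0, len(bits), 2)])

symbols = modulate([0, 0, 1, 1, 1, 0])  # -> three constellation points
```

Higher-order schemes such as 16-QAM work the same way, with more points per symbol carrying more bits at the cost of reduced noise margin.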
  • the receiver receives and demodulates (for example, by coherent demodulation) a sequence of noisy symbols, where the noise has been added by the communication channel.
  • noisy demodulated symbols are then mapped to sequences of data elements by a channel decoder.
  • the decoded data elements are then passed to the source decoder, which decodes them to reconstruct a representation of the original input source symbols and thereby the information source.
  • the source encoder and decoder are designed jointly, as are the channel encoder and decoder, but the source encoder/decoder and channel encoder/decoder are designed and operate separately to perform very different functions.
  • the main advantage of separate source and channel coding is the modularity it provides. This means that the same channel encoder and decoder can be used in conjunction with any source encoder and decoder.
  • a channel encoder can encode data elements for transmission over a channel irrespective of the data elements or the information source from which they have been derived.
  • the source encoder and decoder can be operated in conjunction with any channel encoder and decoder to transmit the encoded source symbols over a communication channel.
  • a source encoder can encode data elements for subsequent coding by the channel encoder independently of which channel encoder is used.
  • This separate source and channel coding design provides modularity and allows independent optimisation of each component, which was theoretically shown by Shannon (Shannon, 1948) to be optimal for point-to-point communication over static channel conditions in the asymptotic infinite blocklength regime.
  • However, the limits of separation-based designs are beginning to show. In such scenarios, the compression delay and the feedback necessary to track the constantly varying instantaneous channel condition are challenging.
  • the theoretical optimality of separation for communication utilising infinite blocklengths with unlimited delay and complexity becomes less relevant for low-latency systems that require short blocklengths and low complexity operations.
  • The cliff effect occurs when the channel condition deteriorates below that anticipated by the channel encoder: the source information is lost completely, leading to a cliff-edge deterioration in system performance.
  • As an example of the cliff-edge effect, if the bit error rate in the transmission of a video over a communications channel exceeds the maximum error rate at which the channel decoder can decode the received signal, the transmission drops out completely, which can present significant challenges for live-streamed video, such as from a drone to a base station. This can lead to discontinuous reception of the transmitted video, and can force the source encoder to encode the video at ever more lossy compression levels and lower resolutions.
  • Effective and efficient transmission of sequences of correlated data items, such as video, over noisy communications channels, allowing continuous reception while keeping errors in reception to a minimum, is therefore desirable in view of the particular performance requirements and volumes of this type of data to be transmitted.
  • the present disclosure provides an encoder, a decoder and communication system for conveying data from an information source across a communication channel using joint source and channel coding, comprising a transmitter and a receiver.
  • the communication system comprises a transmitter having an encoder for encoding input data from an information source for transmission of a transformed version of the input data across a communication channel.
  • the encoder has a key item encoder neural network for encoding data items as key items independent of any other data item in the sequence, and an interpolation item encoder neural network for encoding data items as interpolation items using data representing the input data item and at least one previous data item in the sequence.
  • the communication system also comprises a receiver having a decoder including complementary key item decoder neural network and interpolation item decoder neural network.
  • the receiver is for receiving and demodulating a noise-affected version of the encoder output vectors based on the signal transmitted across the communication channel and passing them to the decoder for reconstructing the sequence of correlated data items.
  • a training method is also disclosed in which the key item encoder neural network and interpolation item encoder neural network are trained together with the respective complementary key item decoder neural network and interpolation item decoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items.
  • the present disclosure provides an encoder for use in a transmitter of a communications system for conveying sequences of correlated data items from an information source across a communications channel using joint source and channel coding.
  • the encoder comprises a key item encoder neural network for encoding data items selected from the sequence to serve as a key item that can be directly reconstructed by the decoder, the key item encoding being based on the input data item and being independent of any other data item in the sequence.
  • the encoder further comprises an interpolation item encoder neural network for encoding data items selected from the sequence to serve as an interpolation item that can be reconstructed by the decoder using interpolation, the interpolation item encoding using data representing the input data item and at least one previous data item in the sequence.
  • the key item encoder neural network and interpolation item encoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in an encoder input layer thereof to encoder output vectors provided at nodes of an encoder output layer thereof, the encoder output vectors being used for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the data items across a communication channel.
  • the key item encoder neural network and interpolation item encoder neural network have in the communications system respective complementary key item decoder neural network and interpolation item decoder neural network for receiving a noise-affected version of the encoder output vector from a receiver receiving and demodulating the signal transmitted across the communication channel and reconstructing the input vector to generate a representation of the input data item.
  • the connecting node weights of the key item encoder neural network and interpolation item encoder neural network have been trained together with the respective complementary key item decoder neural network and interpolation item decoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items.
  • the present disclosure provides a decoder for use in a receiver of a communications system for conveying sequences of correlated data items from an information source across a communications channel using joint source and channel coding.
  • the decoder comprises a key item decoder neural network for decoding data items from the sequence indicated as key items, the key item decoding being based on a noise-affected version of an encoder output vector generated at a transmitter by a complementary key item encoder neural network to encode the data item based on the input data item independent of any other data item in the sequence, the key item decoder neural network being configured for reconstructing the input vector to generate a representation of the input data item directly from the noise-affected version of the encoder output vector and independently of any other data item in the sequence.
  • the decoder further comprises an interpolation item decoder neural network for decoding data items indicated as interpolation items, the interpolation item decoding being based on data representing at least one previous data item in the sequence and a noise-affected version of an encoder output vector generated at a transmitter by a complementary interpolation item encoder neural network to encode the data item based on data representing the input data item and at least one previous data item in the sequence, the noise-affected version of the encoder output vector having been received and demodulated at the receiver based on the signal transmitted across the communication channel, the interpolation item decoder neural network being configured for reconstructing the input vector by interpolation from reconstructions of representations of other data items in the sequence to generate a representation of the input data item.
  • the key item decoder neural network and interpolation item decoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in a decoder input layer thereof to decoder output vectors provided at nodes of a decoder output layer thereof, the decoder output vectors providing a reconstruction of the encoder input vector to generate a representation of the input data item.
  • the connecting node weights of the key item decoder neural network and interpolation item decoder neural network have been trained together with the respective complementary key item encoder neural network and interpolation item encoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data.
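As a hedged illustration of such a complementary key item encoder/decoder pair, the following PyTorch sketch shows an encoder mapping a frame to a power-normalised channel vector and a decoder mapping the noise-affected vector back to a frame. The layer sizes, the convolutional architecture and the power normalisation are assumptions for the example, not the patent's design:

```python
import torch
import torch.nn as nn

class KeyItemEncoder(nn.Module):
    """Maps a frame x in R^{3 x H x W} to a power-normalised channel vector z."""
    def __init__(self, channel_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(64 * 4 * 4, channel_dim),
        )

    def forward(self, x):
        z = self.net(x)
        # Normalise each vector so its average power per channel use is one.
        return z / z.norm(dim=1, keepdim=True) * z.shape[1] ** 0.5

class KeyItemDecoder(nn.Module):
    """Reconstructs a frame from the noise-affected channel vector."""
    def __init__(self, channel_dim=256, h=64, w=64):
        super().__init__()
        self.h, self.w = h, w
        self.net = nn.Sequential(nn.Linear(channel_dim, 3 * h * w), nn.Sigmoid())

    def forward(self, z_noisy):
        return self.net(z_noisy).view(-1, 3, self.h, self.w)
```

The interpolation item networks would have the same input/output contract but additionally consume data representing previous items, as described above.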
  • the present disclosure provides a communication system for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding.
  • the communication system comprises a transmitter including an encoder and a receiver including a decoder.
  • the encoder comprises a key item encoder neural network for encoding data items selected from the sequence to serve as a key item that can be directly reconstructed by the decoder, the key item encoding being based on the input data item and being independent of any other data item in the sequence.
  • the encoder further comprises an interpolation item encoder neural network for encoding data items selected from the sequence to serve as an interpolation item that can be reconstructed by the decoder using interpolation, the interpolation item encoding using data representing the input data item and at least one previous data item in the sequence.
  • the key item encoder neural network and interpolation item encoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in an encoder input layer thereof to encoder output vectors provided at nodes of an encoder output layer thereof, the encoder output vectors being used for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the data items across a communication channel.
  • the transmitter is configured for transmitting signals over the communication channel based on signal values of the encoder output vectors of the key item encoder neural network and interpolation item encoder neural network.
  • the receiver is configured for receiving and demodulating a noise-affected version of the encoder output vectors based on the signal transmitted across the communication channel and passing them to the decoder for reconstructing the sequence of correlated data items.
  • the decoder comprises a key item decoder neural network for decoding data items from the sequence indicated as key items, the key item decoding being based on a noise-affected version of an encoder output vector generated at a transmitter, the key item decoder neural network being configured for reconstructing the input vector to generate a representation of the input data item directly from the noise-affected version of the encoder output vector and independently of any other data item in the sequence.
  • the decoder further comprises an interpolation item decoder neural network for decoding data items indicated as interpolation items, the interpolation item decoding being based on data representing at least one previous data item in the sequence and a noise-affected version of an encoder output vector generated at a transmitter.
  • the interpolation item decoder neural network is configured for reconstructing the input vector by interpolation from reconstructions of representations of other data items in the sequence to generate a representation of the input data item.
  • the key item decoder neural network and interpolation item decoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in a decoder input layer thereof to decoder output vectors provided at nodes of a decoder output layer thereof, the decoder output vectors providing a reconstruction of the encoder input vector to generate a representation of the input data item.
  • the present disclosure provides a method for conveying sequences of correlated data items from an information source across a communications channel using joint source and channel coding.
  • the method comprises, at a transmitter, for each data item in the sequence: selecting data items from the sequence of data items to serve as key items and interpolation items; encoding data items to serve as key items using a key item encoder neural network, the key item encoding being based on the input data item and being independent of any other data item in the sequence; encoding data items to serve as interpolation items using an interpolation item encoder neural network, the interpolation item encoding using data representing the input data item and at least one previous data item in the sequence.
  • the key item encoder neural network and interpolation item encoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in an encoder input layer thereof to encoder output vectors provided at nodes of an encoder output layer thereof, the encoder output vectors being used for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the data items across a communication channel.
  • the method further comprises, at the transmitter, transmitting signals over a communication channel based on signal values of encoder output vectors of the key item encoder neural network and interpolation item encoder neural network.
  • the method further comprises, at a receiver: receiving and demodulating a noise-affected version of the encoder output vectors based on the signal transmitted across the communication channel; decoding data items from the sequence indicated as key items using a key item decoder neural network based on a noise-affected version of the encoder output vector for the data item, the key item decoder neural network being configured for reconstructing the input vector to generate a representation of the input data item directly from the noise-affected version of the encoder output vector and independently of any other data item in the sequence; and decoding data items from the sequence indicated as interpolation items using an interpolation item decoder neural network based on data representing at least one previous data item in the sequence and the noise-affected version of an encoder output vector for the data item, the interpolation item decoder neural network being configured for reconstructing the input vector by interpolation from reconstructions of representations of other data items in the sequence to generate a representation of the input data item.
  • the key item decoder neural network and interpolation item decoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in a decoder input layer thereof to decoder output vectors provided at nodes of a decoder output layer thereof, the decoder output vectors providing a reconstruction of the encoder input vector to generate a representation of the input data item.
  • the connecting node weights of the key item encoder neural network and interpolation item encoder neural network have been trained together with the respective complementary key item decoder neural network and interpolation item decoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items.
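A hedged end-to-end sketch of this run-time flow follows. The fixed group-of-pictures pattern, the simulated AWGN channel, and the function signatures of the four networks are assumptions for illustration only:

```python
import torch

def transmit_sequence(frames, key_enc, interp_enc, key_dec, interp_dec,
                      gop=4, snr_db=10.0):
    """Send a frame sequence: every `gop`-th frame as a key item, the rest
    as interpolation items conditioned on the previous reconstruction."""
    recon, prev = [], None
    for t, x in enumerate(frames):
        is_key = t % gop == 0
        # Key items are coded independently; interpolation items also see
        # the previous reconstruction (concatenated along the channel dim).
        z = key_enc(x) if is_key else interp_enc(torch.cat([x, prev], dim=1))
        sigma = 10 ** (-snr_db / 20)            # AWGN std for unit-power symbols
        z_noisy = z + sigma * torch.randn_like(z)   # simulated noisy channel
        x_hat = key_dec(z_noisy) if is_key else interp_dec(z_noisy, prev)
        recon.append(x_hat)
        prev = x_hat                            # decoder-side history
    return recon
```

In a real system the channel is the physical medium rather than added tensor noise; the noise injection here stands in for it so the whole loop is differentiable during training.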
  • the present disclosure provides a computer readable medium comprising one or more instructions which when executed cause at least one of: a transmitter; and a receiver; to operate in accordance with the above-described method.
  • In this way, a machine-learned method for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding, which optimises the reconstructed quality end-to-end, is achieved. This method deviates from the separation-based designs by optimising a single encoder and decoder, which jointly provide the same or better performance compared to expert-designed, modular systems.
  • In embodiments, the encoder is configured to encode the sequences of correlated data items from an information source for transmission across the communications channel as streaming media. In embodiments, the encoder is configured to encode the sequences of correlated data items from an information source into a static media file.
  • In embodiments, the sequences of correlated data items are a series of image frames providing a video. In embodiments, the correlated data items are each represented by a 3D matrix with a depth based on the colour channels, a height based on the height of the frame and a width based on the width of the frame. In embodiments, the encoder input layers of the key item encoder neural network and/or the interpolation item encoder neural network are configured to receive video frames as input vectors.
  • the encoder is configured to create a coded and compressed representation of the input data as an encoder output vector for transmission across the communications channel, and the decoder is configured to decode the noise-affected version of the encoder output vector back into an uncompressed reconstruction of the input data.
  • In embodiments, the quantity of information included in the output vector of the key item encoder neural network and/or the interpolation item encoder neural network is smaller than that of the input vector.
  • the interpolation encoder input layer is configured such that the interpolation item encoding uses data received at the input layer thereof representing the input data item in input data space and at least one previous data item in the sequence in input data space.
  • the data representing at least one previous data item in the sequence used by the interpolation item encoder to encode interpolation items comprises: the motion representation information representing the relative motion between the data item and at least one other data item in the sequence; and the residual information between the data item and a motion compensated version of at least one other data item in the sequence using the motion representation information in respect of that data item.
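For illustration, here is a hedged sketch of computing such motion representation and residual information with dense optical flow. OpenCV's Farnebäck method is one possible choice (the disclosure does not prescribe it), and the warping convention is a simplification:

```python
import cv2
import numpy as np

def motion_and_residual(prev_gray, curr_gray):
    """Return (flow, residual): dense flow from the current frame back to the
    previous one, and the residual after motion-compensating the previous frame."""
    flow = cv2.calcOpticalFlowFarneback(curr_gray, prev_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = curr_gray.shape
    gx, gy = np.meshgrid(np.arange(w), np.arange(h))
    # Sample the previous frame where the flow says each current pixel came from.
    map_x = (gx + flow[..., 0]).astype(np.float32)
    map_y = (gy + flow[..., 1]).astype(np.float32)
    warped_prev = cv2.remap(prev_gray, map_x, map_y, cv2.INTER_LINEAR)
    residual = curr_gray.astype(np.int16) - warped_prev.astype(np.int16)
    return flow, residual
```

The interpolation item encoder would then consume the flow field and residual (rather than the raw frame alone), which is the decomposition the bullet above describes.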
  • the interpolation encoder input layer is configured such that the interpolation item encoding uses data received at the input layer thereof representing: the input data item in the latent space defined by the output of the key item encoder neural network; and at least one previous data item in the sequence in the latent space defined by the output of the key item encoder neural network.
  • the data representing the input data item used by the interpolation item encoder to encode interpolation items comprises: the key item encoder output vector encoded for the data item by the key item encoder neural network; and wherein the data representing at least one previous data item in the sequence used by the interpolation item encoder to encode interpolation items comprises an encoder output vector transmitted by the encoder for at least one previous data item in the sequence.
  • the data representing at least one previous data item in the sequence used by the interpolation item decoder to decode interpolation items comprises a noise-affected version of an encoder output vector or a reconstruction of the encoder input vector providing a representation of the input data item for at least one previous data item in the sequence.
  • the data representing at least one previous data item in the sequence used by the interpolation item encoder to encode interpolation items further comprises data representing at least one subsequent data item in the sequence.
  • the encoder and decoder further comprise a static control module configured to select data items from the sequence of data items as key items and interpolation items according to a fixed order specified by a predetermined group of items, the static control module being further configured to use the key item encoder and decoder to encode and decode data items selected as key items, and to use the interpolation item encoder and decoder to encode and decode data items selected as interpolation items.
  • the encoder further comprises a dynamic control module having a dynamic decision agent configured to dynamically choose whether the data item is to serve as a key item or an interpolation item.
  • the dynamic decision agent is configured to dynamically choose whether the data item is to serve as a key item or an interpolation item based at least on one or more of: the current data item; the number of data items transmitted since the last key item; a current average channel utilisation; and a channel utilisation constraint.
  • the dynamic decision agent is configured to dynamically choose whether the data item is to serve as a key item or an interpolation item so that the average channel utilisation is below the channel utilisation constraint.
  • the dynamic control module is configured to: select, based on a decision output by decision agent for the data item, whether the data item is to serve as a key item or an interpolation item; if the data item is selected to serve as a key item, use the key item encoder to encode the data item in the sequence to provide a key item encoder output vector for the item, the encoder being configured for transmitting the key item encoder output vector on the communications channel.
  • the dynamic control module is further configured to: if the data item is selected to serve as an interpolation item, use the interpolation encoder to encode the data item to provide an interpolation item encoder output vector for the item, the encoder being configured for transmitting the interpolation item encoder output vector on the communications channel.
  • the dynamic decision agent is configured to generate mapping data indicating, for the sequence of data items, which data items are key data items and which data items are interpolation data items, for transmission across the communications channel and for use by the decoder to determine whether the received noise-affected version of an encoder output vector should be decoded by the key item decoder neural network or the interpolation item decoder neural network.
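A hedged sketch of one such decision rule follows. In the disclosure the agent may be learned (e.g. by reinforcement learning, per the classification); the hand-written policy and thresholds below are illustrative stand-ins only:

```python
def choose_item_type(frames_since_key, avg_channel_use, channel_budget,
                     max_interp_run=8):
    """Return 'key' or 'interp' for the current data item.

    Key items cost more channel uses but stop error propagation; interpolation
    items are cheap but compound errors. So bound the interpolation run length,
    and spend on a key item whenever average channel use is safely under budget.
    """
    if frames_since_key >= max_interp_run:
        return "key"                  # bound the interpolation run length
    if avg_channel_use < 0.8 * channel_budget:
        return "key"                  # spare budget: spend it on robustness
    return "interp"                   # tight budget: cheap interpolation
```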
  • In embodiments, the encoder output layers of the interpolation item encoder neural network and the decoder input layers of the interpolation item decoder neural network are divided into ordered blocks, and the neural networks are trained such that the interpolation item encoder neural network encodes information in descending order of importance across increasing blocks of nodes, and such that, for interpolation items, the decoder reconstructs an increasingly refined representation of the input data as more blocks are received in the noise-affected version of an encoder output vector.
  • the communication system further comprises a bandwidth allocation module configured to determine, for each data item in the sequence selected to serve as an interpolation item, a number of blocks of the interpolation encoder output layer to be transmitted over the communication channel, so as to allocate the available bandwidth in the communications channel to the transmission of interpolation items.
  • the bandwidth allocation module is further configured to determine the number of blocks of the interpolation encoder output layer to be transmitted over the communication channel to seek to minimise the reconstruction error between the representation of the input data reconstructed at the decoder and the input data encoded at the encoder.
  • the bandwidth allocation module is configured to determine the number of blocks of the interpolation encoder output layer to be transmitted over the communication channel based on at least motion representation information determined to represent the relative motion between the data item and at least one other data item in the sequence. In embodiments, the bandwidth allocation module is configured to determine a number of blocks of the interpolation encoder output layer to be transmitted over the communication channel, so as to seek to optimally allocate the available bandwidth in the communications channel between a group, a set or the whole sequence of data items to be transmitted.
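As a hedged sketch of one possible allocation policy, the proportional-to-motion heuristic below gives more blocks to interpolation items with larger average motion magnitude. The rule, its minimum-block floor and its tie-breaking are illustrative assumptions, not the trained module of the disclosure:

```python
import numpy as np

def allocate_blocks(motion_mags, total_blocks, min_blocks=1):
    """Split a block budget across a group's interpolation items in proportion
    to each item's mean motion magnitude (more motion -> more bandwidth).
    Assumes total_blocks >= min_blocks * len(motion_mags)."""
    mags = np.asarray(motion_mags, dtype=float) + 1e-8
    blocks = np.maximum(min_blocks,
                        np.floor(total_blocks * mags / mags.sum())).astype(int)
    # Hand any blocks left over to the items with the most motion.
    for i in np.argsort(-mags):
        if blocks.sum() >= total_blocks:
            break
        blocks[i] += 1
    return blocks
```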
  • the interpolation encoder neural network is configured to: maintain and update an internal state as successive interpolation items of a group of consecutive interpolation items are encoded by the interpolation encoder neural network; and after successive interpolation items have been encoded into the internal state, to provide the internal state as the interpolation encoder output vector for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the group of consecutive interpolation data items across the communication channel.
  • the encoder neural network is configured to output an encoder output vector for transmission for each key item and each group of consecutive interpolation items between key items.
  • the interpolation decoder neural network is configured to: for a group of consecutive interpolation items, recursively decode the noise-affected version of the encoder output vector received from a receiver to thereby reconstruct the encoder input vectors of successive interpolation items to generate a representation of the input data items of the group of consecutive interpolation items.
  • the interpolation encoder neural network and the interpolation decoder neural network are both provided by a recurrent neural network, optionally a Long Short-Term Memory (LSTM) network.
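A hedged sketch of such a recurrent interpolation encoder, assuming PyTorch's `LSTMCell` and per-item feature vectors (both illustrative assumptions; the disclosure only requires a maintained, updated internal state):

```python
import torch
import torch.nn as nn

class RecurrentInterpEncoder(nn.Module):
    """Folds a group of consecutive interpolation items into an LSTM state,
    then emits that state as a single channel vector for the whole group."""
    def __init__(self, feat_dim=512, channel_dim=256):
        super().__init__()
        self.lstm = nn.LSTMCell(feat_dim, channel_dim)

    def forward(self, item_features):
        h = c = None
        for f in item_features:               # one feature vector per item
            if h is None:
                h = f.new_zeros(f.shape[0], self.lstm.hidden_size)
                c = torch.zeros_like(h)
            h, c = self.lstm(f, (h, c))       # update the internal state
        return h                              # internal state -> output vector
```

The complementary decoder would unroll recursively from the received noisy vector, emitting one reconstruction per item in the group, mirroring the bullet above.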
  • the encoder output vectors provide values in a signal space that represent in-phase and quadrature components for modulation of the carrier signal for transmission over the communication channel.
  • the encoder output vectors provide values defining a probability distribution sampleable to provide values in a signal space that represent in-phase and quadrature components for modulation of the carrier signal for transmission over the communication channel.
  • the encoder output vectors provide values in the signal space that are assigned exactly to symbol values of a predetermined finite set S of symbols of a predefined alphabet of carrier modulation signal values transmittable by the transmitter over the communication channel.
  • the predefined alphabet is a fixed, predefined constellation of symbols for digitally modulating the carrier signal to encode the input data for transmission over the communication channel.
  • the encoder output vectors provide values corresponding to a predetermined finite set of symbols of an existing channel encoder and decoder scheme for transmission of data over the communication channel.
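A hedged sketch of constraining learned outputs to such a finite alphabet by nearest-symbol projection follows. The straight-through gradient trick used to keep the projection trainable is a common practice assumed here for illustration, not something stated in the disclosure, and the unnormalised 16-QAM alphabet is likewise illustrative:

```python
import torch

def project_to_constellation(z, constellation):
    """Snap complex encoder outputs to the nearest symbol of a fixed alphabet.
    The straight-through trick keeps gradients flowing during training."""
    dists = (z.unsqueeze(-1) - constellation).abs()   # |z_i - s| for every s
    nearest = constellation[dists.argmin(dim=-1)]
    return z + (nearest - z).detach()  # forward: nearest symbol; backward: identity

# An illustrative (unnormalised) 16-QAM alphabet of I/Q points.
levels = torch.tensor([-3.0, -1.0, 1.0, 3.0])
qam16 = torch.complex(levels.repeat(4), levels.repeat_interleave(4))
```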
  • the transformation applied by the communication channel may, in embodiments, also include an existing channel code.
  • the encoder and decoder may learn an optimum mapping of the input information source to inputs of an existing channel code of the communications channel that reduces reconstruction errors at the output of the decoder neural network. Although acting as an outer code in these embodiments, this learned coding of the encoder and decoder is still optimised based on the characteristics of the communication channel to reduce reconstruction errors, even though in these alternative embodiments the communication channel includes an existing channel code.
  • the present disclosure provides a method of training an encoder and a decoder for use in a communication system in accordance with the above aspects and embodiments of the present disclosure for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding.
  • the method comprises: for input-output pairs of a set of training data items from the information source passed to the encoder, determining an objective function characterising a reconstruction error between input-output pairs of training data from the information source passed to the encoder and the representation of the input data reconstructed at the decoder; and using an appropriate optimisation algorithm operating on the objective function, updating the connecting node weights of the hidden layers of the key item encoder neural network, interpolation item encoder neural network, key item decoder neural network and interpolation item decoder neural network to seek to minimise the objective function.
  • the encoder neural networks and decoder neural networks have been trained together using training data in which a model of the communication channel is used to estimate channel noise and add it to the transmitted signal values to generate a noise-affected version of the vector of signal values in the input-output pairs of training data.
  • the encoder output layers of the interpolation item encoder neural network and the decoder input layers of the interpolation item decoder neural network are divided into ordered blocks, wherein the encoder output vector passed to the decoder input layer for each input-output pair during training is truncated to a random number of the ordered blocks, such that, following training, the interpolation item encoder neural network encodes information in descending order of importance across increasing blocks of nodes, and such that, for interpolation items, the decoder reconstructs an increasingly refined representation of the input data as more blocks are received in the noise-affected version of an encoder output vector.
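A hedged sketch of one such training step, assuming PyTorch, an AWGN channel model, mean-squared-error distortion, and the random block truncation described above. The decoder signature (taking the number of kept blocks) is an assumption for the example:

```python
import torch
import torch.nn.functional as F

def train_step(encoder, decoder, x, optimizer, snr_db=10.0, n_blocks=8):
    """One end-to-end update: encode, keep a random number of ordered blocks,
    pass them through a simulated AWGN channel, decode and backpropagate."""
    z = encoder(x)                                    # (batch, channel_dim)
    block = z.shape[1] // n_blocks
    keep = torch.randint(1, n_blocks + 1, (1,)).item()
    z_kept = z[:, : keep * block]                     # random block truncation
    sigma = 10 ** (-snr_db / 20)                      # AWGN channel model
    z_noisy = z_kept + sigma * torch.randn_like(z_kept)
    x_hat = decoder(z_noisy, keep)                    # decoder told block count
    loss = F.mse_loss(x_hat, x)                       # reconstruction distortion
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the gradient flows through the simulated channel, the encoder and decoder weights are optimised jointly against the channel statistics, which is the essence of the training method above.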
  • the present disclosure provides a computer readable medium comprising one or more instructions, which when executed cause a computing device to operate the above-described methods of training an encoder and a decoder for use in the above-described communication systems for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding.
  • Figure 1 shows a communication system for conveying sequences of correlated data items, such as video, from an information source across a communications channel using joint source and channel coding in accordance with an example of the present disclosure
  • Figure 2 shows the encoding, transmission and reconstruction of sequences of correlated data items in the form of video frames from an information source using a communication system in accordance with an example of the present disclosure
  • Figure 3 shows an example run time method for the transmitter and the encoder in accordance with an example of the present disclosure
  • Figure 4 shows an example run time method for the receiver and the decoder in accordance with an example of the present disclosure
  • Figure 5 shows an example training time method for the neural networks of the encoder and decoder in accordance with an example of the present disclosure
  • Figure 6 shows a structure of a communication system in accordance with an example of the present disclosure, showing the use of a key item encoder and decoder;
  • Figure 19 shows a performance comparison of the performance envelope of another example of the communication system of Figure 6 and the performance of separate source coding by H.265 (i.e. MPEG-H Part 2) and channel coding by a low-density parity-check (LDPC) code at different code rates, for encoding, transmitting and reconstructing other example sequences of correlated data items over a channel having AWGN at various signal-to-noise ratios (SNRs);
  • Figure 20 shows a visual comparison of reconstructed frames of an example video encoded and transmitted across a channel having additive white Gaussian noise at 13 dB, 3 dB and -4 dB, by an example of the communication system of Figure 6 trained at different SNRs and by separate source coding by H.264 (i.e. MPEG-4 AVC) and channel coding;
  • Figure 21 shows a performance comparison of another example of the communication system of Figure 6 and the performance of separate source coding by H.264/H.265 and channel coding by a rate-3/4 low-density parity-check (LDPC) code with 16-QAM modulation, for encoding, transmitting and reconstructing other example sequences of correlated data items over a channel having AWGN at a signal-to-noise ratio (SNR) of 20 dB, for different bandwidth compression rates; and
  • Figure 22 shows a performance comparison of the performance envelope of another example of the communication system of Figure 6, showing the difference in performance of the system having a uniform bandwidth allocation to the frames in a group of pictures, a non-uniform bandwidth allocation in accordance with a pre-determined heuristic, and an optimal bandwidth allocation based on embodiments of the present disclosure in which a bandwidth allocation module is used.
  • the terms “A or B,” “at least one of A and/or B,” or “one or more of A and/or B” may include all possible combinations of A and B.
  • “A or B,” “at least one of A and B,” “at least one of A or B” may indicate all of (1) including at least one A, (2) including at least one B, or (3) including at least one A and at least one B.
  • the terms “first” and “second” may modify various components regardless of importance and do not limit the components. These terms are only used to distinguish one component from another. For example, a first user device and a second user device may indicate different user devices from each other regardless of the order or importance of the devices.
  • a first component may be denoted a second component, and vice versa without departing from the scope of the disclosure.
  • when an element (e.g., a first element) is referred to as being coupled or connected with/to another element (e.g., a second element), it can be coupled or connected with/to the other element directly or via a third element.
  • the term “processor configured (or set) to perform A, B, and C” may mean a general-purpose processor (e.g., a CPU or application processor) that may perform the operations by executing one or more software programs stored in a memory device, or a dedicated processor (e.g., an embedded processor) for performing the operations.
  • the terms as used herein are provided merely to describe some embodiments thereof, but not to limit the scope of other embodiments of the disclosure.
  • Figure 1 shows a communication system 100 comprising a transmitter 110 for conveying sequences of correlated data items, such as video, from an information source 111 across a communication channel 120 to a receiver 130 using joint source and channel coding in accordance with an example of the present disclosure.
  • Figure 2 shows the encoding, transmission and reconstruction of sequences of correlated data items in the form of video frames from an information source 111 using the communication system 100.
  • the transmitter 110 and receiver 130 may each be part of respective electronic devices for transmitting or receiving sequences of correlated data items, such as video.
  • the electronic device coupled to the transmitter 110 or receiver 130 may be a smartphone, a tablet, a personal computer such as a desktop computer, a laptop computer, a netbook computer, a workstation, a server, a wearable device such as a smart watch, smart glasses, a head-mounted device or smart clothes, an airborne or land drone, a robot or other autonomous device such as industrial or home robots, a security control panel, a gaming console, a security camera, a microphone, or an Internet of Things device for sensing or monitoring, such as a smart meter, various sensors, an electric or gas meter, a medical device such as a portable medical measuring device, a blood sugar measuring device, a heartbeat measuring device, or a body temperature measuring device, a navigation device, a global positioning system (GPS) receiver, an event data recorder (EDR), a flight data recorder (FDR), avionics, or point-of-sale devices.
  • the electronic device may also be a base station or relay in a radio communication system, which may be capable of operating in accordance with one or more communications standards, such as the wireless communication standards 802.11xx for WiFi maintained by the Institute of Electrical and Electronics Engineers (IEEE), and the 3G, LTE and NR standards for cellular communications maintained by the 3rd Generation Partnership Project (3GPP), or any other radio transceiver for receiving signals transmitted across the communications channel, and decoding them for onward transmission, for example on the Internet.
  • the transmitter 110 includes an information source 111, at least one processor 112, memory 113 and a carrier modulator 118 coupled to an antenna 119 for transmitting data over communication channel 120.
  • the information source 111 is a source of data items to be transmitted over the communication channel 120 by the transmitter 110.
  • the information source 111 is a source of data provided as a sequence of correlated data items $x_1, x_2, x_3, \dots, x_n$ in which the correlation is manifested as some degree of redundancy in the data in adjacent items in the sequence.
  • the data items $x_1, x_2, x_3, \dots, x_n$ may, for example, be frames of a video in which the pixel data presented in consecutive video frames may be correlated in location and brightness.
  • In the example information source 111 shown in Figure 2, which shows a video captured by a security camera of a largely static scene of a harbour under constant illumination, there may be a significant amount of redundancy from one video frame to the next.
  • Where the video captures a moving item in a scene, such as a moving boat, or a moving scene caught by a panning camera, differences between one frame and the next may be analysable by optical flow analysis.
  • the information source 111 is not limited to being a source of video data, and the present disclosure is intended to be applicable to sources of any suitable sequences of correlated data items, where the correlation may occur in time or space, or both, or along any other suitable dimension over which the data items are correlated.
  • the information source 111 may be a source of sensor data from one or more sensors sensing one or more physical characteristics of a system that vary over time or location within the physical system in such a way that the data items are correlated.
  • the data items may be captured in steps of equal or unequal intervals along the dimension in which they are correlated.
  • the information source 111 is any information source suitable for arranging as a sequence of source symbols or fundamental data elements (for example, bits).
  • the information source 111 is a video source that provides the sequence of data items $x_1, x_2, x_3, \dots, x_n$ as video frames.
  • the correlated data items $x_1, x_2, x_3, \dots, x_n$ may each be represented by a 3D matrix with a depth based on the colour channels (normally 3 channels for RGB), a height, $H$, based on the height of the frame and a width, $W$, based on the width of the frame, i.e. $x_n \in \mathbb{R}^{H \times W \times 3}$.
  • the information source 111 may generate the correlated data items locally to the transmitter (such as a video captured by a camera coupled to the transmitter, such as in the electronic device of which the transmitter is a part) or it may be a source of data stored locally to the transmitter that was generated elsewhere, remotely from the transmitter 110.
  • the encoding and transmission of the data items from the information source 111 may be performed asynchronously with the time at which the data items were generated, or it may be performed live or in real time, with the encoding being performed largely contemporaneously to the generation of the data items.
  • the information source 111 may provide the data items for encoding and transmission as a static media file that is encoded and transmitted and then reconstructed and stored at the receiver 130 where it can be viewed in a player or conveyed further.
  • the information source 111 may also provide the data items for encoding and transmission as a stream of video frames for encoding and transmission on the fly, which are then to be reconstructed at the receiver 130 where it can be viewed in a player or conveyed further for replay as a streaming video, in which case the received video stream may or may not be stored locally at the receiver to allow subsequent asynchronous replay.
  • the information source 111 may store or generate ‘raw’ or ‘uncompressed’ data directly or indirectly representative of characteristics of the information source, to allow faithful reproduction of the information source 111 by a given combination of data processing hardware appropriately configured, for example by software or firmware.
  • the data items may be pre-processed before being passed to the encoder 115 through an initial form of encoding which may already compress the data items. This does not preclude the encoder of the present disclosure learning a further, optimal joint source-channel coding for the communication channel 120 to minimise reconstruction errors.
  • the data items may represent segments of the data provided by the information source 111. For example, rather than each data item representing an individual video frame, the video frames may be divided into blocks or segments, with each block being represented by a separate sequence of data items.
  • the processor 112 executes instructions that can be loaded into memory 113.
  • the processor 112 can include any suitable number(s) and type(s) of processors or other devices in any suitable arrangement.
  • Example types of processor 112 include microprocessors, microcontrollers, digital signal processors, field programmable gate arrays and application specific integrated circuits.
  • the memory 113 may be provided by any structure(s) capable of storing and facilitating retrieval of information (such as data, program code, and/or other suitable information on a temporary or permanent basis).
  • the memory 113 can represent a random access memory or any other suitable volatile or non-volatile storage device(s).
  • the memory 113 may also contain one or more components or devices supporting longer-term storage of data, such as a read-only memory, hard drive, flash memory, or optical disc, which may store software code for loading into the memory 113 at runtime.
  • the processor 112 and memory 113 provide a Runtime Environment (RTE) 114 in which instructions or code loaded into the memory 113 can be executed by the processor to generate instances of software modules in the Runtime Environment 114.
  • the memory 113 comprises instructions which, when executed by the one or more processors 112, cause the one or more processors 112 to instantiate an encoder 115 in the RTE 114.
  • the encoder 115 includes a key item encoder neural network 115k and an interpolation item encoder neural network 115i for encoding data items selected from the sequence of correlated data items to serve as key items and interpolation items respectively.
  • the encoder 115 may include a motion and residual module 115m for deriving interpolation information in the input data space for encoding interpolation items, for example using optical flow analysis in the case of video data.
  • the encoder 115 may also include a bandwidth allocation module 115b for determining the bandwidth to be allocated to the transmission of the interpolation items.
  • the encoder 115 may also include an encoder control module 115c to control the operation of the key item encoder neural network 115k and an interpolation item encoder neural network 115i to encode data items from the correlated sequence for transmission.
  • the encoder 115 may be configurable by instructions stored in memory 113 and implemented in RTE 114 to carry out the runtime methods described in relation to Figure 3, Figures 6-9 and 11, and Figures 12-15 for encoding sequences of input data items $x_1, x_2, x_3, \dots, x_n$ from information source 111 into sequences of encoder output vectors $z_1, z_2, z_3, \dots, z_n$ used for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the data items across a communication channel 120, the signal values provided from the encoder output vectors $z_1, z_2, z_3, \dots, z_n$ representing a transformed version of the input data items.
  • the encoder 115 is configured to receive the data items $x_1, x_2, x_3, \dots, x_n$, in the example the video frames, as input vectors for providing to input layers of the key item encoder neural network 115k and/or the interpolation item encoder neural network 115i. Once the encoder 115 has encoded data items from the sequence of correlated data items into encoder output vectors $z_1, z_2, z_3, \dots, z_n$, the encoder output vectors are passed to the carrier modulator 118.
  • the encoder output vectors $z_1, z_2, z_3, \dots, z_n$ are used for providing values in a signal space for modulating, by the carrier modulator 118, a carrier signal for transmission of a transformed version of the data items across the communication channel 120.
• the carrier modulator 118 may operate, in use, to directly encode the in-phase (I) and quadrature (Q) components of one or more carriers or subcarriers with signal values provided to the carrier modulator 118 by the encoder 115 using an appropriate modulation technique to provide a channel input signal 118i for transmission by antenna 119 across the communication channel 120.
  • a suitable multiplexing technique such as orthogonal Frequency-Division Multiplexing (OFDM) may be used.
  • the carriers encoding the encoder output vectors z 1, z 2, z 3, ... z n in the channel input signal 118i are then transmitted by the antenna 119 onto the communication channel 120.
  • the encoder 115 may be configured to output encoder output vectors z 1, z 2, z 3, ... z n that may provide values defining a probability distribution sampleable to provide values in a signal space that represent in-phase and quadrature components for modulation of the carrier signal for transmission over the communication channel.
  • the key item encoder neural network 115k and/or the interpolation item encoder neural network 115i may be configured as variational autoencoders.
  • the carrier modulator 118 and antenna 119 may be of conventional construction and may be configured to encode the carriers/subcarriers with signal values of complex IQ representations.
  • the carrier modulator 118 may be configured to freely modulate the carriers/subcarriers with any IQ signal value within the signal space passed to it.
  • the encoder output vectors provide values in the signal space that are assigned exactly to symbol values of a predetermined finite set S of symbols of a predefined alphabet of carrier modulation signal values transmittable by the transmitter over the communication channel.
  • the predefined alphabet is a fixed, predefined constellation of symbols for digitally modulating the carrier signal to encode the input data for transmission over the communication channel.
  • the carrier modulator 118 may be configured to only be able to modulate the carriers/subcarriers with IQ values corresponding to one or more finite, fixed sets or ‘constellations’ of symbols such as by quadrature amplitude modulation (QAM) or binary phase-shift keying (BPSK).
  • the carrier modulator 118 and antenna 119 may be compatible with the 5G New Radio standard such that the transmittable symbols of IQ values are mapped to the 16-QAM, 64-QAM or 256-QAM constellations.
  • the carrier modulator 118 and antenna 119 may be hard-wired to work only with these symbols, and they may not be able to transmit signal values or symbols that are not within these standard constellation sets.
  • the encoder 115 may be configured to learn the optimum encoding within the available constellation of transmittable IQ signal values.
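For concreteness, the following sketch (Python/NumPy; the function names, shapes and nearest-symbol rule are illustrative assumptions, not the disclosed training method) shows continuous encoder outputs being snapped to the nearest symbols of a unit-power 16-QAM constellation:

```python
import numpy as np

def qam16_constellation():
    """Build a unit-average-power 16-QAM constellation as complex symbols."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    points = np.array([complex(i, q) for i in levels for q in levels])
    return points / np.sqrt((np.abs(points) ** 2).mean())  # normalise average power to 1

def map_to_constellation(z, constellation):
    """Snap each continuous IQ value in the encoder output to its nearest symbol."""
    # z: 1-D array of complex encoder outputs; result has the same shape
    distances = np.abs(z[:, None] - constellation[None, :])
    return constellation[distances.argmin(axis=1)]

z = np.array([0.2 + 0.9j, -1.1 - 0.3j])          # example encoder outputs
symbols = map_to_constellation(z, qam16_constellation())
```

A hard nearest-symbol mapping like this is not differentiable, so during end-to-end training a soft assignment or straight-through approximation would typically stand in for the argmin.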
  • the communication channel 120 may be used to convey information from one or more such transmitters 110 to one or more such receivers 130.
  • the communication channel 120 may be a physical connection, e.g., a wire, or a wireless connection such as a radio channel as in the example shown in Figure 1.
• the communication channel 120, including the noise associated with such a channel, is modelled and defined by its characteristics and statistical properties.
  • Channel characteristics can be identified by comparing the input and output of the channel, the output of which is likely to be a randomly distorted version of the input.
  • the distortion indicates channel statistics such as additive noise, or other imperfections in the communication medium such as fading or synchronization errors between the transmitter 110 and the receiver 130.
  • Channel characteristics include the distribution model of the channel noise, slow fading and fast fading.
  • Common channel models include binary symmetric channel and additive white Gaussian noise (AWGN) channel.
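As a concrete illustration of the AWGN model referred to above, channel noise might be simulated as follows (a NumPy sketch; the function name and SNR parameterisation are assumptions):

```python
import numpy as np

def awgn_channel(z, snr_db, rng=None):
    """Add complex white Gaussian noise to a signal at the given SNR (in dB)."""
    rng = rng or np.random.default_rng()
    snr = 10 ** (snr_db / 10)
    signal_power = np.mean(np.abs(z) ** 2)
    noise_power = signal_power / snr
    # complex noise: half the power in each of the I and Q components
    noise = np.sqrt(noise_power / 2) * (
        rng.standard_normal(z.shape) + 1j * rng.standard_normal(z.shape)
    )
    return z + noise
```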
  • the receiver 130 includes at least one processor 132, memory 133 and a carrier demodulator 138 coupled to an antenna 139 for receiving data over communication channel 120.
• a bus system (not shown) may be provided which supports communication between the at least one processor 132, memory 133, carrier demodulator 138 and antenna 139.
  • the receiver 130 thus includes an information sink 131 to which the reconstructed representation of the input data decoded by the decoder neural network 135 is provided.
  • the processor 132 executes instructions that can be loaded into memory 133, and in use provide a Runtime Environment (RTE) 134 in which instructions or code loaded into the memory 133 can be executed by the processor to generate instances of software modules in the Runtime Environment 134.
  • the memory 133 comprises instructions which, when executed by the one or more processors 132, cause the one or more processors 132 to instantiate a decoder 135.
  • the antenna 139 of the receiver 130 receives as a channel output 138o from the communications channel 120 a noise-affected version of the encoder output vectors z 1, z 2, z 3, ... z n transmitted by the antenna 119 of the transmitter 110, the noise having been added by the communication channel 120.
• the carrier demodulator 138 demodulates these noisy versions of the encoder output vectors z 1, z 2, z 3, ... z n , for example by coherent demodulation, and passes them to the decoder 135 in the RTE 134.
• noisy demodulated versions of the encoder output vectors z 1, z 2, z 3, ... z n are then mapped by the decoder 135 to a reconstructed representation of the originally input sequence of data items x 1, x 2, x 3, ... x n , which is passed to the information sink 131 at which a reconstruction of the information source 111 is collected for viewing, storing or conveying further.
• the information sink 131 collects a decoded reconstruction of the video frames x 1, x 2, x 3, ... x n passed to the encoder 115 and transmitted over the communications channel 120.
  • the decoder 135 includes a key item decoder neural network 135k and an interpolation item decoder neural network 135i for decoding data items indicated as key items and interpolation items respectively.
  • the decoder 135 may include a motion and residual module 135m for reconstructing interpolation data items in the input data space using decoded interpolation information provided by interpolation item decoder neural network 135i.
• the decoder 135 may also include a decoder control module 135c to control the operation of the key item decoder neural network 135k and the interpolation item decoder neural network 135i to decode data items from the correlated sequence for provision to information sink 131 at which the reconstructed representation of the input data items from the information source 111 is collected.
  • the decoder 135 may be configurable by instructions stored in memory 133 and implemented in RTE 134 to carry out the runtime methods described in relation to Figure 4, Figures 6-8, 10 and 11, and Figures 12-14 and 16 for decoding sequences of noise-affected version of the encoder output vectors z 1, z 2, z 3, ... z n received over the communications channel 120 to a reconstructed representation of the originally input sequence of data items x 1, x 2, x 3, ... x n .
  • the encoder control module 115c may be configured for passing data items from the information source 111 to the key item encoder neural network 115k or the interpolation item encoder neural network 115i for encoding.
• the encoder control module 115c may be configurable to operate as a static control module, which may pass data items from the information source 111 to the key item encoder neural network 115k or the interpolation item encoder neural network 115i based on a fixed order specified by a predetermined group of items, such as a fixed group of pictures for a video encoding scheme (e.g. every 7th item may be a key item, with the intervening items all being interpolation items).
• the encoder control module 115c may also be configurable to operate instead as a dynamic control module, which may pass data items from the information source 111 to the key item encoder neural network 115k or the interpolation item encoder neural network 115i based on a dynamically assigned order specified by, for example, a decision agent implemented as a Markov Decision Process (a sketch of both selection modes is given below).
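A minimal sketch contrasting the two control modes might look as follows (Python; the gop_size default, the string labels and the agent.decide interface are hypothetical illustrations rather than the disclosed implementation):

```python
def select_item_type_static(t, gop_size=7):
    """Fixed group-of-pictures pattern: every gop_size-th item is a key item."""
    return "key" if t % gop_size == 0 else "interpolation"

def select_item_type_dynamic(x_t, items_since_key, avg_channel_use, use_limit, agent):
    """Dynamic pattern: a decision agent (e.g. an MDP policy) chooses per item,
    subject to keeping the average channel utilisation below the constraint."""
    if avg_channel_use >= use_limit:
        return "interpolation"                   # conserve bandwidth when over budget
    return agent.decide(x_t, items_since_key)    # hypothetical agent interface
```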
  • the decoder control module 135c may be configured for passing the noise-affected version of the encoder output vectors z 1, z 2, z 3, ... z n received over the communications channel 120 to either the key item decoder neural network 135k or the interpolation item decoder neural network 135i for decoding based on whether the respective data item is indicated as a key item or an interpolation item.
  • the passing of the noise-affected version of the encoder output vectors z 1, z 2, z 3, ... z n received over the communications channel 120 may be based on a fixed order specified by a predetermined group of items.
  • the receiver 130 may receive separate signalling from the transmitter 110 indicating the sequence of key items and interpolation items based on the dynamic operation of the encoder control module 115c.
  • the key item encoder neural network 115k and key item decoder neural network 135k are formed as a complementary pair which may be configured as an autoencoder for encoding and decoding data items in the sequence selected as key items independent of any other data item in the sequence.
  • the key item encoder neural network 115k is for encoding data items selected from the sequence to serve as key items that can be directly reconstructed by key item decoder neural network 135k, the encoding and decoding of key items being independent of any other data item in the sequence.
  • the interpolation item encoder neural network 115i and the interpolation item decoder neural network 135i are formed as a complementary pair which may be configured as an autoencoder, a recurrent neural network, a long short-term memory, or any other suitable neural network configuration, for encoding and decoding data items in the sequence selected as interpolation items by interpolation at least in relation to a previous data item in the sequence.
  • the interpolation item encoder neural network 115i is for encoding data items selected from the sequence to serve as interpolation items that can be reconstructed by the interpolation item decoder neural network 135i, and other components of the decoder 135 as needed, using interpolation, the encoding and decoding of interpolation items using data representing the input data item and at least one previous data item in the sequence.
  • Neural networks are machine learning models that employ multiple layers of nonlinear units (known as artificial “neurons”) to generate an output from an input. Neural networks may be composed of several layers, each layer formed from nodes. Neural networks can have one or more hidden layers in addition to the input layer and the output layer.
  • each layer uses a set of parameters, which are optimized during the training stage.
  • each layer comprises a set of nodes, the nodes having learnable biases and their inputs having learnable weights. Learning algorithms can automatically tune the weights and biases of nodes of a neural network to optimise the output in order to minimise an objective function using an optimisation algorithm such as gradient descent or stochastic gradient descent.
  • the key item encoder neural network 115k has an input layer having nodes for receiving input data x 1, x 2, x 3, ... x n for encoding representative of input data items from the information source.
  • the interpolation item encoder neural network 115i also has an input layer having nodes for receiving input data for encoding.
  • the data input to the input layer of the interpolation item encoder neural network 115i depends on the neural network architecture, and whether the interpolation item encoder neural network 115i encodes interpolation information in the input data space or the latent space.
• the input layer of the interpolation item encoder neural network 115i may receive input data for encoding representative of input data items from the information source, relating to the current and at least the previous input data item in the sequence (i.e. x n and x n-1 ).
  • the input layer of the interpolation item encoder neural network 115i may receive the output vector z n of the key item encoder neural network 115k for current and at least the previous input data item in the sequence (i.e. z n and z n-1 ).
  • the key item encoder neural network 115k and interpolation item encoder neural network 115i have respective encoder output layers that output encoder output vectors z 1, z 2, z 3, ... z n that are used for providing values in a signal space for modulating, by the carrier modulator 118, a carrier signal for transmission by antenna 119 over communications channel 120.
  • the key item encoder neural network 115k and interpolation item encoder neural network 115i have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in the encoder input layer thereof to the encoder output vectors such that the transmitter 110 transmits a transformed version z 1, z 2, z 3, ... z n of the input data items x 1, x 2, x 3, ... x n across the communication channel 120.
  • the key item decoder neural network 135k and interpolation item decoder neural network 135i have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in a decoder input layer thereof to decoder output vectors provided at nodes of an decoder output layer thereof, the decoder output vectors providing a reconstruction of the encoder input vector to generate a representation of the input data item.
  • the connecting node weights of the key item encoder neural network 115k and interpolation item encoder neural network 115i have been trained together with the respective complementary key item decoder neural network 135k and interpolation item decoder neural network 135i, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items.
  • the training of the connecting node weights may be performed using an appropriate optimization algorithm operating on the objective function.
• the input data from the information source 111, such as the image or video, transmitted by the transmitter 110 can be received and decoded at the receiver 130 to allow a reconstructed representation of the original input image or video to be generated at information sink 131.
  • the key item encoder neural network 115k and interpolation item encoder neural network 115i may be configured to create a coded and compressed representation of the input data as an encoder output vector for transmission across the communications channel.
  • the encoder output vectors may contain less information than the encoder input vectors for each data item.
  • the encoder 115 may be configured such that the bandwidth allocation module 115b may interoperate with the key item encoder neural network 115k, interpolation item encoder neural network 115i and encoder control module 115c to encode the data items selected as interpolation items in such a way that an available bandwidth in the communication channel 120, or data budget, is shared between the data items so that the data items are compressed, for example such that a channel use constraint is met.
  • the bandwidth allocation module 115b may be provided by a neural network configured, for example by reinforcement learning, to select, for each interpolation item in a group of pictures, a number of blocks of the interpolation item encoder output vector for transmission, where the interpolation item encoder neural network 115i has been trained to encode increasing information with an increased number of encoder output vector blocks.
• the bandwidth allocation module 115b may be configured to work with the encoder control module 115c to dynamically select successive data items as interpolation items to be encoded together into the same recurrently updated encoder output vector for transmission, such that successive interpolation items are encoded together in a compressed transmission for recurrent decoding.
• the decoder may be configured to decode the noise-affected version of the encoder output vector back into an uncompressed reconstruction of the input data.
  • the quantity of information included in the output vector of the key item encoder neural network and/or the interpolation item encoder neural network is smaller than the input vector.
• the communication channel 120 may be one that also includes an existing channel encoder and decoder scheme, in which the signal space of the channel input may be the predetermined finite set of symbols of the channel code (which could be bit values) for modulating a signal carrier for providing input signals in the alphabet in the input signal space for the communication channel 120.
  • the transformation applied by the communication channel 120 may, in embodiments, also include an existing channel code.
  • the signal space may be a message alphabet for an existing channel code by which the input signals to the given communications channel are modulated.
• the encoder output vector will be mapped into the message alphabet of the corresponding channel code (rather than, for example, the raw IQ values transmittable by the transmitter).
• the noise-affected version of the encoder input vector input to the decoder neural network 135 may correspond to the hard-decoded message of the existing channel decoder.
• the encoder neural network 115 and decoder neural network 135 may learn an optimum mapping of the input information source 111 to inputs of an existing channel code of the communications channel 120 that reduces reconstruction errors at the output 131 of the decoder neural network 135.
  • this learned coding of the encoder neural network 115 and decoder neural network 135 is still optimised based on the characteristics of the communication channel 120 to reduce reconstruction errors, even though in these alternative embodiments the communication channel 120 includes an existing channel code. This is unlike existing modular source codes which are defined independently of the random transformation applied by any channel. [0089] Reference will now be made to Figures 3 and 4 which set out in more detail how the transmitter 110 and receiver 130, and the trained neural networks of the encoder 115 and decoder 135, operate to transmit data from information source 111 across communication channel 120 by joint source and channel coding.
• the data items may be received at the encoder control module 115c which may select each data item in the sequence as a key item or an interpolation item, according to a static or dynamic allocation of a group of items, and then pass it to the key item encoder neural network 115k or interpolation item encoder neural network 115i as encoder input vectors x 1, x 2, x 3, ... x n for processing accordingly.
  • the encoder control module 115c may be a static control module configured to select data items from the sequence of data items as key items and interpolation items according to a fixed order specified by a predetermined group of items.
  • the encoder control module 115c may be a dynamic control module having a dynamic decision agent configured to dynamically choose whether the input data item x t is to serve as a key item or an interpolation item.
• the dynamic decision agent may be configured to dynamically choose whether the input data item x t is to serve as a key item or an interpolation item based at least on one or more of: the current data item x t ; the number of data items transmitted since the last key item; a current average channel utilisation; and a channel utilisation constraint.
  • the dynamic decision agent may be configured to dynamically choose whether the input data item x t is to serve as a key item or an interpolation item so that the average channel utilisation is below the channel utilisation constraint.
  • the dynamic decision agent may be configured to generate data mapping, for the sequence of data items, which data items are key data items and which data items are interpolation data items. This mapping is for transmission across the communications channel 120 and for use by the decoder 135 to determine whether the received noise-affected version of an encoder output vector z t should be decoded by the key item decoder neural network 135k or the interpolation item decoder neural network 135i.
• the encoder 115 may determine, statically or dynamically, whether the data item x t is selected as a key item or an interpolation item. [0094] If the data item is selected as a key item, in step 303, the data item x t is passed to the key item encoder neural network 115k where it is encoded to a latent encoder output vector z t based on the input data item x t and being independent of any other data item in the sequence. As a key item, the data item x t can be directly reconstructed by the decoder from the noise-affected version of the encoder output vector z t alone.
  • the interpolation item encoder neural network 115i is used to encode to a latent encoder output vector z t representative of the input data item x t .
  • the encoding by the interpolation item encoder neural network 115i may be performed using data representing the input data item x t and at least one previous data item in the sequence x t-1 .
  • the encoding by the interpolation item encoder neural network 115i may be performed also using data representing the input data item x t and at least one subsequent data item in the sequence x t+1 .
  • the input data item x t , and other data items used in the encoding the latent encoder output vector z t by the interpolation item encoder neural network 115i may be pre-processed before being passed to the interpolation item encoder neural network 115i to provide representative data to facilitate the encoding of interpolation information to allow the reconstruction of the input data item x t at the decoder by interpolation from reconstructions of representations of other data items in the sequence.
  • the input data item x t may be pre-processed by a motion and residual module 115m of the encoder 115 to generate one or more of: motion representation information representing the relative motion between the data item and at least one other data item in the sequence; and the residual information between the data item and a motion compensated version of at least one other data item in the sequence using the motion representation information in respect of that data item.
  • the encoding of a latent encoder output vector z t for interpolation items by the interpolation item encoder neural network 115i may use data received at the input layer thereof representing the input data item in input data space and at least one previous data item in the sequence in input data space.
  • the input data space is the pixel space of the video.
  • the motion representation information may therefore be optical flow information for the input video frame x t produced from an optical flow analysis of the sequence of video frames.
  • the input data item x t may be pre-processed by the key item encoder neural network 115k to encode the data item x t into a latent space vector z t .
• the input layer of the interpolation item encoder neural network 115i may be configured such that the interpolation item encoding uses data representing the input data item in the latent space defined by the output of the key item encoder neural network 115k (i.e. z t ), and at least the previous data item in the latent space (i.e. z t-1 ).
  • the output of the interpolation item encoder neural network 115i may also be a vector in the latent space z, the vector z t being representative of interpolation information in the latent space z for the data item x t . Encoding the interpolation information in the latent space z in this way may be more efficient and effective than encoding the interpolation information in the input data space of x.
• once the data item x t is encoded by the key item encoder neural network 115k or the interpolation item encoder neural network 115i into a latent space vector z t , it is passed in step 305 to the carrier modulator 118.
• the latent space vector z t has values usable for providing values in a signal space for modulating, by the carrier modulator 118, a carrier signal or one or more subcarriers with a transformed version of the data item x t .
• once the carrier signal has been encoded, it is transmitted across communication channel 120 using antenna 119.
  • the bandwidth allocation module 115b may control the number of output blocks of each latent space vector z t that are transmitted to share out the available bandwidth or bit budget, for example to meet an average channel use condition, or based on an allocation of bandwidth for the transmission.
  • the bandwidth allocation module 115b may be configured to determine the number of blocks of the interpolation encoder output layer to be transmitted to minimise the reconstruction error between the representation of the input data reconstructed at the decoder and the input data encoded at the encoder. In embodiments, the bandwidth allocation module 115b may be configured to determine the number of blocks of the interpolation encoder output layer to be transmitted based on at least motion representation information determined to represent the relative motion between the data item x t and at least one other data item in the sequence (e.g. x t-1 ).
  • the bandwidth allocation module is configured to determine a number of blocks of the interpolation encoder output layer to be transmitted, so as to seek to optimally allocate the available bandwidth in the communications channel between a group, a set or the whole sequence of data items to be transmitted (e.g. x 1, x 2, x 3, ... x n ).
• the interpolation item encoder neural network 115i is provided by a recurrent neural network (RNN) such as an LSTM.
  • the bandwidth allocation module 115b may work together with the encoder control module 115c to encode successive interpolation items into the cell of the RNN for transmission in a single latent space vector z t for recursive decoding by the decoder 135.
  • the encoder neural network 115i is operated by the bandwidth allocation module 115b working together with the encoder control module 115c to maintain and update an internal cell state thereof as successive interpolation items of a group of consecutive interpolation items are encoded by the interpolation encoder neural network 115i.
  • the encoder 115 is configured to provide the internal state as the interpolation encoder output vector z t for providing values in a signal space for modulating the carrier signal for transmission of a transformed version of the group of consecutive interpolation data items across the communication channel 120. That is, the encoder 115 is configured to output an encoder output vector z t for transmission for each key item and each group of consecutive interpolation items between key items.
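As an illustrative sketch of such recurrent encoding of a group of consecutive interpolation items into a single output vector (PyTorch; the class name and dimensions are assumptions):

```python
import torch
import torch.nn as nn

class RecurrentInterpolationEncoder(nn.Module):
    """Minimal sketch: successive interpolation items update an LSTM cell state,
    and the final hidden state is sent as a single latent vector for the group."""

    def __init__(self, item_dim=256, latent_dim=64):
        super().__init__()
        self.cell = nn.LSTMCell(item_dim, latent_dim)

    def forward(self, items):
        # items: list of (batch, item_dim) tensors for consecutive interpolation items
        h = c = torch.zeros(items[0].shape[0], self.cell.hidden_size)
        for x_t in items:
            h, c = self.cell(x_t, (h, c))   # the cell state accumulates the group
        return h                            # one output vector z_t for the whole group

encoder = RecurrentInterpolationEncoder()
group = [torch.randn(1, 256) for _ in range(3)]  # three consecutive interpolation items
z = encoder(group)                               # shape (1, 64)
```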
• the transmission by the transmitter 110 of the carrier signals modulated using the latent space vectors z 1,2,3,...n may be in sequence as each latent space vector is encoded for each time step as the data item x t for that time step is generated or received, for example on the fly in the event of streaming data.
  • the latent space vectors z 1,2,3,...n may all be encoded first before being transmitted in sequence by the transmitter.
• the transmitter process 300 is completed.
• the run time method 400 for the receiver 130 and the decoder 135 starts in step 401 in which the antenna 139 receives a carrier signal from communications channel 120 and passes it to carrier demodulator 138, which demodulates the carrier signal to recover noise-affected versions of the encoder output vectors z 1,2,3,...n as they are received, and passes them to the decoder control module 135c.
• the decoder control module 135c determines whether the noise-affected version of encoder output vector z t for a given time step t encodes the input data item x t for that time step as a key item or an interpolation item. In embodiments where the ordering of the group of items is static and follows a pre-defined order, the decoder control module 135c may determine whether the received vector is representative of a key item or an interpolation item based on that order. In embodiments where the ordering of the data items as key or interpolation items is assigned dynamically, the decoder control module 135c may determine whether the received vector is representative of a key item or an interpolation item based on mapping data received from the transmitter 110.
• in step 403, the vector is passed to the key item decoder neural network 135k where it is directly decoded to provide a reconstruction of the input vector x t independently of any other data item in the sequence.
  • the key item decoder neural network 135k generates a representation of the input data items of x 1,2,3,...n indicated as key items directly from the relevant noise-affected version of the encoder input vectors received at the receiver.
• in step 404, the vector is passed to the interpolation item decoder neural network 135i where it is decoded to provide a reconstruction of the input vector x t based on data representing at least one previous data item x t-1 in the sequence and the noise-affected version of the encoder output vector z t for the data item.
  • the data representing at least one previous data item x t-1 in the sequence used by the interpolation item decoder neural network 135i to decode interpolation items may comprise a reconstruction of the encoder input vector providing a representation of the input data item x t-1 for at least the previous data item in the sequence.
• the interpolation item decoder neural network 135i decodes the noise-affected version of the encoder output vector z t to provide an estimate of the motion representation information representing the relative motion between the data item x t and the at least one other data item in the sequence (e.g. x t-1 ).
  • the interpolation item decoder neural network 135i decodes the noise-affected version of the encoder output vector z t to provide an estimate of the residual information between the data item x t and a motion compensated version of the at least one other data item in the sequence (e.g. x t-1 ) using the motion representation information in respect of that data item generated at the motion and residual module 115m and encoded by the interpolation item encoder neural network 115i.
• the data representing at least one previous data item x t-1 in the sequence used by the interpolation item decoder neural network 135i to decode interpolation items may comprise a noise-affected version of an encoder output vector received for the previous data item.
• the interpolation item decoder neural network 135i provides a reconstruction of the input vector x t by decoding the noise-affected version of the encoder output vector z t based on that vector and the vector for the previous data item. That is, the current and previous vectors are used recursively as inputs by the recurrent neural network to update the cell state and provide the reconstruction as an output.
• the vector for the previous data item corresponds to a representation of the previous data item x t-1 in the latent space, as represented by an encoding of the reconstruction using the key item encoder neural network 115k.
  • the decoder 135 may store a software module in memory 133 for instantiating the key item encoder neural network 115k locally in RTE 134.
• the reconstruction of the encoder input vector providing a representation of the input data item x t-1 is obtained at the decoder 135.
• once the reconstruction of the encoder input vector x t has been decoded by the key item decoder neural network 135k or the interpolation item decoder neural network 135i, it is passed in step 405 to the information sink 131 at which the information source is reconstructed.
  • the reconstruction of the information source generated in the information sink 131 may be stored locally, for example in memory 133, for local reproduction at a later stage, or it may be reproduced contemporaneously without being stored permanently locally (for example in the case of streaming media).
  • the reconstruction of the information source generated in the information sink 131 may also be conveyed onward for reproduction elsewhere, for example using the Internet.
  • a training time process 500 for optimising the weights of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, to minimise reconstruction errors will now be described with reference to Figure 5.
  • the weights of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i are jointly optimized end-to-end in an unsupervised manner by passing training data sample vectors as inputs through the communication system 100 (or a simulation thereof using a channel model to add noise) and receiving its reconstruction vector in a forward pass of training data through the neural networks.
• the training data sample vectors are received (individually or in batches) and passed through the communication system to obtain the reconstruction vectors, forming input-output pairs of a set of training data in respect of a training data information source 111.
• the input-output pairs of vectors of training data may be calculated empirically, by the transmitter 110, in the forward pass, encoding and transmitting the encoder output vector representation of the input vector of training data across the communication channel 120, where the signal values are subsequently received as the noise-affected vector decoded by receiver 130 to the reconstruction vector.
  • the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i can be optimised to take into account the noise in the channel through training based on empirical data capturing the effects of channel noise on the transmission.
• the input-output pairs of vectors of training data may be generated using a model of the communication channel 120 to estimate channel noise and add it to the transmitted encoder output vector, generating a simulation of the noise-affected vector for subsequent decoding and reconstruction of the output training data vector by the decoder neural networks 135k and 135i.
  • a channel model can be adopted that simulates the practical channel experienced in the operational regime.
  • the channel model can be any model that simulates an arbitrary transformation of the encoder output vector transmitted by the transmitter 110.
• the training process may perform batchwise optimisation across groups of input-output pairs, such as using gradient descent to determine the error gradient from the forward pass and compute an update to the weights.
• stochastic gradient descent may be used, in which the error is determined and weights updated for each input-output pair of vectors of training data, before the next pair of vectors of the training data is processed using the updated weights.
  • an objective function is determined characterising a reconstruction error between the input-output pairs of vectors of training data.
• the reconstruction error for the objective function is characterised using the mean squared error (MSE) loss between the input vectors and their reconstructions, calculated as $L_{\mathrm{MSE}} = \frac{1}{N} \sum_{t=1}^{N} \lVert x_t - \hat{x}_t \rVert_2^2$.
  • Other objective functions characterising the reconstruction error may be used.
  • the method further comprises, in steps 505 and 507 which may be performed together, using an appropriate optimisation algorithm operating on the objective function, updating the connecting node weights of the hidden layers of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, to seek to minimise the objective function.
  • the gradient descent optimisation algorithm is used to seek to minimize the objective function by using a differential of the objective function to determine the gradient and the direction towards a minimum value for the objective function.
  • the gradient descent algorithm operates on the objective function based on a differential of at least the key item encoder and decoder neural network pair, 115k and 135k, for training data items that are key items, and the interpolation item encoder and decoder neural network pair, 115i and 135i, for training data items that are interpolation items.
  • the gradient of the objective function can be efficiently calculated with respect to the weights in the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, for example by unstacking the elementary functions used to compute the forward pass, and by repeatedly applying the chain rule to autodifferentiate them and determine the gradient with respect to the weights in the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, by backpropagation.
  • the connecting node weights of the hidden layers of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i are updated to seek to minimise the objective function.
• this is achieved in the gradient descent optimisation method by using the determined gradient to estimate an update to the weights of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, that is expected to step the objective function towards a minimum, where the local gradient is zero.
  • the weights of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i are updated and, in step 509, it is checked whether there are more samples of training data in the training set.
  • the process 500 returns to step 501 and the next batch or training sample is received and the optimisation method is carried out again to further optimise the weights of the neural networks. If training over the training set is complete, the process 500 ends and a trained key item encoder and decoder neural network pair, 115k and 135k, and interpolation item encoder and decoder neural network pair, 115i and 135i, are provided for use in an operational communication system 100 for transmitting input data over a communication channel 120.
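For illustration, one end-to-end training step of the kind described above might be sketched as follows (PyTorch; the differentiable AWGN helper and module interfaces are assumptions, with encoder and decoder being any modules mapping items to latent symbols and back):

```python
import torch
import torch.nn as nn

def awgn(z, snr_db):
    """Differentiable AWGN channel model: noise is added in the forward pass and
    gradients flow through the addition back to the encoder in backpropagation."""
    noise_power = z.detach().pow(2).mean() / (10 ** (snr_db / 10))
    return z + noise_power.sqrt() * torch.randn_like(z)

def train_step(encoder, decoder, x, optimizer, snr_db=10.0):
    """One end-to-end step on an input-output pair, minimising the MSE objective."""
    z = encoder(x)                # encoder output vector (latent channel symbols)
    z_noisy = awgn(z, snr_db)     # simulated channel in the forward pass
    x_hat = decoder(z_noisy)      # reconstruction of the input item
    loss = nn.functional.mse_loss(x_hat, x)
    optimizer.zero_grad()
    loss.backward()               # backpropagate through the channel model
    optimizer.step()
    return loss.item()
```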
• the encoder and decoder blocks are built as artificial neural networks with learnable parameters so that the transformation from data to latent representation (code) and back can be learned directly from data.
• if the constellation symbols S transmittable by the transmitter are predefined, as is the case when using standard communication hardware and protocols, or if an existing channel code is used, these pre-existing codes act as constraints for the optimisation and the objective function.
• if a channel model is used in the forward pass of the training process, rather than training data being generated empirically, the channel model can be included directly in the backward pass of the optimisation algorithm. If the channel model used is differentiable, it can be used directly in the backpropagation stage. If it is not differentiable, a generative adversarial network (GAN) may be used to learn a differentiable representation of the channel model.
  • the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i can be optimised to take into account the noise in the channel through training based on a theoretical noise model of the communication channel.
• the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i can be trained together using training data in which a model of the communication channel is used to estimate channel noise and add it to the transmitted signal values to generate the noise-affected version of the vector of signal values in the input-output pairs of training data.
  • the objective function may characterise and optimise against further constraints and characteristics of the communication system 100, such as to obtain an average power in the symbols transmitted across the communication system 100, so as to ensure the learned coding system satisfies an average power constraint.
• where the interpolation item encoder and decoder neural network pair, 115i and 135i, are to encode a descending ordering of information in increasing blocks of nodes, such that, for interpolation items, the decoder reconstructs an increasingly refined representation of the input data as more blocks are received in the noise-affected version of an encoder output vector, the training process 500 can be adapted as follows.
  • the encoder output layers of the interpolation item encoder neural network 115i and the decoder input layers of the interpolation item decoder neural network 135i are divided into ordered blocks.
• the encoder output vector passed to the decoder input layer for each input-output pair during training is selected to have a random number of the ordered blocks, such that, following training, the interpolation item encoder neural network 115i encodes a descending ordering of information in increasing blocks of nodes, and such that, for interpolation items, the decoder 135 reconstructs an increasingly refined representation of the input data with increasing blocks received in the noise-affected version of the encoder output vector, as sketched below.
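One way the random block selection during training could be realised is sketched below (PyTorch; the block layout and function name are assumptions):

```python
import torch

def mask_random_blocks(z, num_blocks):
    """Keep a random number of leading blocks of the encoder output vector and
    zero the rest, so that earlier blocks learn to carry the most important
    information and the decoder can refine with each extra block received."""
    batch, dim = z.shape
    block_size = dim // num_blocks
    keep = int(torch.randint(1, num_blocks + 1, (1,)))  # keep 1..num_blocks blocks
    mask = torch.zeros_like(z)
    mask[:, : keep * block_size] = 1.0
    return z * mask

z = torch.randn(4, 64)                # batch of encoder output vectors
z_partial = mask_random_blocks(z, 8)  # a random prefix of 8-value blocks survives
```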
  • the communication system 100 in this embodiment operates a static control module allocating data items according to a fixed group of pictures.
  • the communication system 100 in this embodiment determines interpolation information for encoding interpolation data items in a pixel space of the input data items.
• the communication system 100 in this embodiment includes an interpolation encoder neural network 115i having an output layer of nodes divided into blocks such that it encodes a descending ordering of information in increasing blocks of nodes, and a bandwidth allocation module 115b for selecting a number of blocks of an encoder output vector for interpolation items to share or allocate bandwidth between interpolation items in the group of pictures.
  • this arrangement has been shown to outperform existing separate source coding and channel coding schemes in terms of reduced reconstruction errors at the decoder across a wide range of different channel conditions.
• the key item encoder neural network 115k, parameterised by θ and mapping a frame to a complex latent vector representing the In-phase (I) and Quadrature (Q) components of a complex channel symbol, is then defined as the mapping $f_\theta : \mathbb{R}^{H \times W \times C} \to \mathbb{C}^k$, $z_t = f_\theta(x_t)$. This is achieved by pairing consecutive real values at the output of the neural network.
• the values in the complex latent vector may first be power normalised to meet a power constraint (see the sketch below), and in step 904 these values are then directly sent through the communication channel 120.
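The power normalisation step might, for example, be implemented as follows (PyTorch sketch; the average-power parameterisation is an assumption):

```python
import torch

def power_normalise(z, avg_power=1.0):
    """Scale a complex latent vector so its average symbol power meets the constraint."""
    current = z.abs().pow(2).mean()
    return z * torch.sqrt(torch.as_tensor(avg_power) / current)

z = torch.randn(128, dtype=torch.cfloat)  # complex latent vector (paired I/Q values)
z_tx = power_normalise(z)                 # mean |symbol|^2 is now 1.0
```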
• the key item decoder neural network 135k, parameterised by φ, maps the noisy latent vector received and demodulated at the receiver in step 1001 and passed to the key item decoder neural network 135k in step 1002 back to the original frame domain in step 1003, and is defined as the mapping $g_\phi : \mathbb{C}^k \to \mathbb{R}^{H \times W \times C}$, $\hat{x}_t = g_\phi(\hat{z}_t)$.
• the key item encoder neural network 115k and key item decoder neural network 135k are then trained together using the method generally described in relation to Figure 5, with the mean squared error $L(\theta, \phi) = \mathbb{E}\lVert x_t - \hat{x}_t \rVert_2^2$ as the loss function, to optimise the weights of the hidden layers thereof to minimise the reconstruction error.
  • a diagram of the key item encoder neural network 115k and key item decoder neural network 135k architecture is shown in Figure 7.
  • the notation kxsycz is used to signify kernel size x, stride y and z kernels.
  • the GDN layer refers to Generalised Divisive Normalisation, which is effective in density modelling and compression of images.
  • the network is fully convolutional, therefore it can accept input of any height (H) and width (W).
• the separate interpolation item encoder neural network 115i is used to encode motion representation and residual information determined in step 905 by motion and residual module 115m.
  • the architecture of the interpolation item encoder neural network 115i and interpolation item decoder neural network 135i is shown in Figure 8.
• the motion representation is generated by motion and residual module 115m in respect of two other frames in the sequence by an optical flow estimator to generate the optical flow and residual information with respect to two frames referred to as anchor frames.
• the motion and residual module 115m determines, as shown in Figure 8, a motion-compensated anchor frame according to the optical flow, producing an approximation of the current frame using the determined optical flow.
• the motion and residual module 115m determines the residual error in the optical flow interpolation as $r_t = x_t - \bar{x}_t$, where $\bar{x}_t$ is the motion-compensated approximation of the frame.
• the residual represents information not captured by optical flow, such as occlusion/disocclusion and camera movements.
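For illustration, motion compensation of an anchor frame by a dense optical flow field, and the corresponding residual, might be computed as follows (PyTorch sketch; a zero flow field stands in for the output of an optical flow estimator such as PWC-Net):

```python
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Motion-compensate a frame with a dense optical flow field via bilinear sampling.
    frame: (B, C, H, W); flow: (B, 2, H, W) pixel displacements (x, y)."""
    b, _, h, w = frame.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + flow  # target coords
    # normalise to [-1, 1] as required by grid_sample
    grid_x = 2.0 * grid[:, 0] / (w - 1) - 1.0
    grid_y = 2.0 * grid[:, 1] / (h - 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(frame, grid, align_corners=True)

anchor = torch.rand(1, 3, 64, 64)
flow = torch.zeros(1, 2, 64, 64)          # zero flow: warp is then the identity
current = torch.rand(1, 3, 64, 64)
residual = current - warp(anchor, flow)   # information the optical flow cannot capture
```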
  • a pre-trained PWC-Net can be used in the motion and residual module 115m.
• the interpolation item encoder neural network 115i, parameterised by its own set of weights, defines the mapping of interpolation data items into the latent space.
• the interpolation item encoder neural network 115i is thus used to encode the data item based on the data item's optical flows and residuals.
  • a bandwidth allocation module 115b may be used to select a number of blocks of an encoder output vector for the interpolation item to share or allocate bandwidth between interpolation items in the group of pictures.
• the values in the complex latent vector may first be power normalised to meet a power constraint, and in step 908, the transmitter transmits the signal values of encoder output vector over communications channel 120.
• the interpolation item decoder neural network 135i, parameterised by its own set of weights, defines the inverse mapping from the received noisy latent vector to estimates of the optical flow and residual information, where a mask applied to slices in the third dimension of the latent vector selects the blocks that were transmitted.
• the decoder motion and residual module 135m reconstructs the frame by blending the motion-compensated anchor frames with a decoded mask and adding the decoded residual, for example as $\hat{x}_t = m \odot \bar{x}_t^{(1)} + (1 - m) \odot \bar{x}_t^{(2)} + \hat{r}_t$, where $\odot$ refers to element-wise multiplication.
  • the reconstructed frames generated in step 1003 by the key item decoder neural network 135k and in steps 1004 and 1005 by the interpolation item decoder neural network 135i and decoder motion and residual module 135m are then passed to information sink 131 at which the reconstruction of the data source 111 is stored.
  • the architecture of the interpolation item encoder neural network 115i and interpolation item decoder neural network 135i is functionally the same as for the key item encoder neural network 115k and key item decoder neural network 135k.
  • the interpolation item encoder neural network 115i is trained together with the interpolation item decoder neural network 135i using the method generally described in relation to Figure 5 and the mean-squared error as the loss function, to optimise the weights of the hidden layers thereof to minimise a reconstruction error.
• the bandwidth allocation module 115b is trained as a separate neural network, parameterised by its own set of weights, having the architecture shown in Figure 11. Given a particular channel use constraint k per GoP, reinforcement learning (RL) is utilised to learn the optimal bandwidth allocation policy for each frame in a GoP, based on the frames themselves, that maximises the video quality.
• with the joint source-channel encoders having the neural network architecture of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, encoded video frames can be successively refined by sending increasingly more information.
• this training process leads to a descending ordering of information across the ordered blocks of the encoder output vector, from the first block to the last.
• when fewer blocks are received, the reconstruction of the data item at the decoder 135 includes less information; when more blocks are received, the reconstruction includes more information.
  • the number of blocks of the encoded latent vector generated by the encoder neural networks 115 can be selected for transmission to vary the bandwidth used to transmit the encoding of each data item. This may be based on an assessment of a relative amount of information needed to minimise reconstruction errors while meeting a channel use condition or a bit budget for the GoP.
  • the neural network parameterised by ⁇ of the bandwidth allocator module 115b is trained using reinforcement learning to allocate the available bandwidth to each of the frames in a GoP, using only the frames themselves, such that the loss metric is minimised.
• the nth GoP in a video is defined as $\mathrm{GoP}_n = \{x_{(n-1)N+1}, \ldots, x_{nN}\}$, where $N$ is the number of frames in a GoP.
• the action set A consists of all the ways to allocate the available bandwidth k to each frame in the GoP. Since we are concerned with maximising the visual quality of the final video, the reward is defined as the quality of the video reconstructed under the chosen allocation. Deep Q-learning is used to learn the optimal allocation policy, where the network seeks to approximate the optimal Q function. Here S represents the set of all states (i.e. all GoPs in a video).
• the purpose of the Q function is to map each state and action pair to a Q value, which represents the total discounted reward from step n given the state and action pair.
• the Q function is defined as the mapping $Q(s_n, a_n) = \mathbb{E}\big[\sum_{k=0}^{\infty} \gamma^k r_{n+k}\big]$, where $0 \le \gamma \le 1$ is the discount factor, which is chosen close to 1 when aiming to optimise the average reward.
• the purpose of the deep neural network of the bandwidth allocator module 115b is to approximate the Q function. To that end, the mean-squared error loss function L is used and gradient descent is performed to update the weights of the network, as sketched below.
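A single deep Q-learning update of this kind might be sketched as follows (PyTorch; the target network, tensor shapes and optimiser are illustrative assumptions):

```python
import torch
import torch.nn as nn

def dqn_update(q_net, target_net, optimizer, state, action, reward, next_state, gamma=0.99):
    """One deep Q-learning step: regress Q(s, a) towards r + gamma * max_a' Q_target(s', a')."""
    # state: (B, state_dim); action: (B,) int64 indices into the allocation action set
    q_value = q_net(state).gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = reward + gamma * target_net(next_state).max(dim=1).values
    loss = nn.functional.mse_loss(q_value, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```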
• Figure 17 plots the performance of each trained model for a bandwidth compression ratio of 0.031 at each evaluation CSNR as given, in the top pane (a), by the measured peak signal-to-noise ratio (PSNR) indicative of the reconstruction quality of the video frames at the receiver 130, and in the bottom pane (b), by the measured multiscale structural similarity index measure (MS-SSIM) indicative of the similarity at different scales between the video from the information source 111 and the video reconstructed at the information sink 131.
• MS-SSIM has been shown to perform better in approximating human visual perception than the more simplistic structural similarity index (SSIM) on different subjective image and video databases.
• the bandwidth compression rate for the separate source and channel coding models is chosen to be at a level that achieves equivalent performance to the best performing joint source and channel coding model at the highest evaluation CSNR of 15dB, in order to compare the best achievable reconstruction performance by both joint- and separate-coding models as the channel noise increases and the evaluation CSNR decreases.
• the bandwidth compression rate for the separate source and channel coding models that achieves the same peak performance as the best joint source and channel coding model at an evaluation CSNR of 15dB is lower than the bandwidth compression rate of the similarly performing joint source and channel coding model. This indicates that, for the same peak performance, the joint source and channel coding model achieves a higher bandwidth compression ratio, meaning less data is transmitted to achieve the same reconstruction performance.
• the performance of the separation-coding does not improve as the channel condition improves above the cliff threshold CSNR either, meaning that, for better channel conditions above the cliff edge threshold, no improvement in quality is observable.
  • a cliff edge deterioration of the H.264 scheme is observed. This cliff edge in performance is simply not seen in the trained joint source channel coding models of the communication system 100 of the present disclosure.
  • the overall performance of the trained joint source and channel coding models of the communication system 100 is better than the best available H.264 and LDPC codes, as the best performing trained joint source and channel coding model beats the best available separation-coding H.264 with LDPC coding scheme for all evaluation CSNR channel noise levels. This is the case in both the PSNR and MS-SSIM metrics, suggesting the superior compression capability of the communication system 100 over separation-based schemes.
  • the trained joint source channel coding models of the communication system 100 consistently outperform the best performing conventional separation codes for reconstruction performance and compression rates, for all channel noise conditions. Further, because the encoder neural networks directly map the source inputs to the channel outputs, and the decoder neural networks directly map the noisy-received channel outputs to the reconstruction of the source inputs, the trained joint source channel coding models of the communication system 100 were consistently three orders of magnitude faster in terms of end-to-end encoding/decoding speed, compared to the best performing separate coding schemes, further reducing latency of transmission.
  • the current channel condition needs to be monitored and the weights of the encoder and decoder neural networks need to be adjusted to the weights trained to match the channel condition. That is, weights are chosen to correspond to a training condition in which the channel noise or SNR matched the estimate of the current channel condition.
  • Figure 19 shows a performance comparison of the performance envelope of another example of the communication system of Figure 6 and the performance of separate source coding by H.265 (i.e. MPEG-H Part 2) and channel coding by LDPC at different code rates for encoding, transmitting and reconstructing other example sequences of correlated data items over a channel having AWGN at various signal to noise (SNR) ratios.
  • the trained joint source and channel encoder and decoder as disclosed herein outperforms H.264 by 0.46 dB in PSNR and by 0.0081 in MS-SSIM for SNR_AWGN ∈ [13, 20] dB, and by 3.07 dB in PSNR and 0.0485 in MS-SSIM for SNR_AWGN ∈ [3, 6] dB.
  • the trained joint source and channel encoder and decoder as disclosed herein falls short of H.265 by 3.22 dB in PSNR, but outperforms it by 0.0006 in MS-SSIM, for SNR_AWGN ∈ [13, 20] dB.
  • the trained joint source and channel encoder and decoder as disclosed herein can be extremely efficient in practice using optimised hardware and libraries, more so than separation-based methods.
  • Figure 21 shows a performance comparison of a trained joint source and channel encoder and decoder as disclosed herein against separate source coding by H.264/H.265 and channel coding by a rate-3/4 low-density parity-check (LDPC) code with 16QAM modulation, for encoding, transmitting and reconstructing other example sequences of correlated data items over a channel having AWGN at a signal-to-noise ratio (SNR) of 20 dB, for different bandwidth compression rates.
  • the bandwidth allocation module needs to be retrained with a different action set.
  • the joint source and channel encoder and decoder as disclosed herein beats H.264 with LDPC coding for all the bandwidth compression ratios tested, in terms of both the PSNR and MS-SSIM metrics. It also beats H.265 on the MS-SSIM metric, as shown in Fig. 6b, although, again, it falls short of H.265 in terms of the PSNR metric.
  • Figure 22 shows a comparison of the performance envelope of another example of the trained joint source and channel encoder and decoder as disclosed herein, showing the difference in performance between a uniform bandwidth allocation to the frames in a group of pictures, a non-uniform bandwidth allocation in accordance with a predetermined heuristic, and an optimised bandwidth allocation based on embodiments of the present disclosure in which a bandwidth allocation module is used.
  • for the optimised bandwidth allocation, the results obtained by using the allocation network are compared with those of uniform allocation (i.e. each frame having the same bandwidth allocation) and with a heuristic bandwidth allocation policy.
  • a dynamic control module 115c allocating data items according to a dynamic decision agent, implemented as a Markov Decision Process (MDP).
  • the communication system 100 in this embodiment determines interpolation information for encoding interpolation data items in a latent space of the encoder output vectors encoded by the key item encoder neural network 115k. Further still, the communication system 100 includes an interpolation encoder neural network 115i configured as a recurrent neural network (RNN), more specifically a Long Short-Term Memory (LSTM), in which the internal cell state is updated to recurrently encode successive interpolation items until the next key frame, which is then normalized to the power constraint and transmitted as the output vector for recurrent decoding at the decoder 135.
  • a bandwidth allocation module 115b of the encoder 115 in this embodiment is implemented as the dynamic control module 115c for selecting whether or not successive items are to be encoded together into the cell state of the interpolation encoder neural network 115i, so that successive interpolation items share the bandwidth needed to transmit the encoder output vector.
  • this arrangement provides enhanced capabilities for efficiently encoding streaming media, such as live video, for transmission in such a way that reconstruction performance is maintained across a wide range of different channel conditions while the bandwidth used can be kept low.
  • although the interpolation item encoder neural network 115i and interpolation item decoder neural network 135i are configured as RNNs, in particular LSTMs, they can be provided by any suitable function that can learn to recursively encode and decode successive items in a sequence into a state for transmission and recursive decoding, such as a transformer architecture (a minimal sketch of the recurrent encoding step follows below).
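As illustration only, the following minimal PyTorch-style sketch shows the recurrent encoding step described above: an LSTM cell folds each new latent vector into its internal state, and the state is power-normalised to form the code word. The class name and the latent/code dimensions are assumptions for the sketch, not the disclosed architecture.

```python
import torch
import torch.nn as nn

class InterpolationEncoder(nn.Module):
    """Minimal sketch of a recurrent interpolation item encoder (cf. 115i)."""
    def __init__(self, latent_dim: int = 256, code_dim: int = 128):
        super().__init__()
        self.cell = nn.LSTMCell(latent_dim, code_dim)

    def forward(self, z_t, state=None):
        h_t, c_t = self.cell(z_t, state)   # fold the current latent into the cell state
        k = h_t.shape[-1]
        code = h_t * (k ** 0.5) / h_t.norm(dim=-1, keepdim=True)  # power normalise
        return code, (h_t, c_t)

enc = InterpolationEncoder()
state = None
for _ in range(3):                         # three consecutive interpolation items
    z_t = torch.randn(1, 256)              # latent from the key item encoder (cf. 115k)
    code, state = enc(z_t, state)          # one code word summarises the run so far
```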
  • the fixed GoP formulation, in which a group of N frames is considered jointly for compression, is forgone; instead, the encoder control module 115c addresses the question of which items to allocate as key items and as interpolation items dynamically, using a dynamic decision agent.
  • the dynamic decision agent is implemented in the embodiment as an infinite horizon Markov decision process (MDP).
  • a sequence of video frames received in step 1501 of encoder process 1500 as a sequence of input data items from information source 111.
  • the sequence of video frames may be a stream of video frames, for example being recorded live by a security camera or a drone.
  • each frame x_t is first transformed into a latent space vector, denoted z_t, via the key item encoder neural network 115k.
  • the complementary key item decoder neural network 135k is similarly defined, with an architecture mirroring that of the key item encoder neural network 115k, and performs the opposite operation.
  • the bandwidth allocation module 115b, working with the dynamic decision agent implemented as an MDP by dynamic control module 115c, dynamically determines whether the data item should serve as a key item or an interpolation item.
  • the MDP state at time step t is defined as a tuple whose elements include k, the number of frames since the last key frame, together with information about the current frame such as motion information (e.g. optical flow vectors); a toy sketch of such a keyframe decision follows below.
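Purely for illustration, the following toy sketch shows a keyframe decision over such a state. The state fields and the thresholds are assumptions; the disclosure trains a decision agent rather than hand-coding a rule like this.

```python
from dataclasses import dataclass

@dataclass
class MdpState:
    frames_since_key: int      # k in the text
    motion_magnitude: float    # e.g. mean optical-flow magnitude (assumed feature)
    avg_channel_use: float     # running average channel utilisation
    channel_budget: float      # channel use constraint B

def choose_key_item(s: MdpState) -> bool:
    """Toy stand-in for the learned policy: pick a key frame on large motion or
    after a long run of interpolation items, if the channel budget allows it."""
    over_budget = s.avg_channel_use >= s.channel_budget
    return (not over_budget) and (s.motion_magnitude > 0.5 or s.frames_since_key >= 8)

print(choose_key_item(MdpState(9, 0.1, 0.4, 1.0)))  # True: long run since last key
```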
  • step 1505 the code word c_t for transmission at time step t is taken to be equal to the latent code z_t output by the key item encoder neural network 115k.
  • step 1506 a secondary encoder network is utilised as the interpolation item encoder neural network 115i.
  • This neural network, and its counterpart interpolation item decoder neural network 135i, have an architecture and mode of operation that is different to the embodiment described above in Figures 6-11.
  • the interpolation item encoder neural network 115i in this embodiment is a recurrent neural network (RNN), more specifically a Long Short-Term Memory (LSTM), the architecture of which is shown in Figure 13, that takes a tuple at its input layer and maps it to a code word c_t.
  • RNN recurrent neural network
  • LSTM Long Short-Term Memory
  • the interpolation item encoder neural network 115i may encode into c_t multiple data items successively selected as interpolation items.
  • the final set of codewords C_t is constructed from the sequence of keyframe decisions made by the bandwidth allocation module 115b and the dynamic decision agent, together with the corresponding codewords.
  • the codeword c_t is flushed and stored in the set of codewords C_t to be transmitted, and the latent vector of the new keyframe, z_{t+1}, is also appended to C_{t+1} as it is a keyframe. This implies that, if a frame is chosen to be a key frame, its codeword is independent of all other codewords; if a frame is not chosen to be a key frame, its codeword depends on the previous codeword (a minimal sketch of this construction follows below).
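The following is a minimal sketch of that codeword set construction, under assumed stand-ins for the trained networks: key_encode maps a frame to its latent z_t, and interp_update folds z_t into the pending codeword (the LSTM state update of the interpolation encoder). The function names and the toy lambdas in the usage example are illustrative assumptions.

```python
import torch

def build_codeword_set(frames, is_key, key_encode, interp_update):
    codewords, pending, state = [], None, None
    for x_t, key in zip(frames, is_key):
        z_t = key_encode(x_t)
        if key:
            if pending is not None:          # flush the pending interpolation codeword
                codewords.append(pending)
                pending, state = None, None
            codewords.append(z_t)            # key codeword: independent of all others
        else:
            pending, state = interp_update(z_t, state)  # depends on previous codeword
    if pending is not None:
        codewords.append(pending)
    return codewords

frames = [torch.randn(1, 256) for _ in range(5)]
keys = [True, False, False, True, False]
out = build_codeword_set(frames, keys, lambda x: x,
                         lambda z, s: (z if s is None else 0.5 * (z + s), z))
print(len(out))  # 4 codewords: two key items and two interpolation runs
```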
  • interpolation between data items is now done implicitly by the interpolation item encoder neural network 115i in the latent space.
  • the benefit of interpolation in the latent space is that the interpolation item encoder neural network 115i may be able to transform the values in the pixel domain to a latent space where interpolation can be done more compactly.
  • optical flow interpolation, which occurs in the input space (pixel) domain, does not capture occlusion/disocclusion.
  • the residual needs to be computed and transmitted to account for this type of information.
  • optical flow treats each frame as a 2D plane in which the pixels are simply moved to obtain subsequent frames, without accounting for the fact that the scene itself may be 3D, so that objects can appear or disappear as objects in front of them move relative to the camera.
  • the function can be thought of as a mapping to a space with more dimensions (greater degrees of freedom) that describes the various types of motion information (optical flow, residual), and owing to the greater degrees of freedom, the interpolation can be done by translating the values in each dimension.
  • one dimension may describe x-axis movement, another may describe the occlusion of objects, and so on.
  • the loop of step 1507, which passes the next interpolation data item x_{t+1} to the key item encoder and then to the interpolation item encoder to update the internal cell state of the interpolation item encoder neural network, means that all those consecutive non-key frames will be represented by a single code word c_t.
  • the interpolation item encoder neural network 115i can thus be seen as a codeword updater that takes in new information about the current frame through the latent vector z_t and updates the previous code word c_{t-1} to obtain the new code word c_t.
  • the encoder control module 115c may store a map of key frame allocations as a binary vector describing which frames are key frames. This information may be sent by the transmitter 110 as side information using conventional digital modulation and channel coding. If the set of codewords transmitted to the receiver 130 by steps 1505 and 1509 at time step t is C_t, where |C_t| is the number of codewords in the set, the bandwidth allocation module 115b may set a channel use constraint B such that the average channel utilisation remains below B (a brief budget-check sketch follows below).
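As illustration only, the sketch below checks such a channel use constraint: the average number of channel uses per frame is tracked and compared against the budget B. The accounting (symbols per codeword divided by frames so far) and the example numbers are assumptions for the sketch.

```python
def within_budget(codeword_lengths, frames_sent, budget_B):
    """True if the average channel utilisation per frame stays below B."""
    avg_use = sum(codeword_lengths) / max(frames_sent, 1)
    return avg_use < budget_B

# Binary key frame allocation map, sent as side information (1 = key frame).
keyframe_map = [1, 0, 0, 1, 0, 0, 0, 1]
print(within_budget([128, 128, 96], frames_sent=8, budget_B=64))  # True: 44 < 64
```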
  • the codeword set C_t may then be power normalised and transmitted by the carrier modulator 118 and antenna 119 across the channel 120. It should be noted that the process 1500, as set out more fully in Algorithm 1 of Figure 14a, does not wait until a certain time t to send the codewords; rather, whenever a new codeword is appended to the set at time t, it is transmitted as soon as it becomes available. Turning now to the receiver process 1600 shown in Figure 16, and as set out more fully in Algorithm 2 of Figure 14b, in step 1601 the receiver receives from the communications channel 120 and demodulates a set of noisy codewords and the key frame allocation map m_t.
  • the decoder 135 at the receiver 130 follows a similar process for decoding to that of the encoder 115 at the transmitter 110.
  • the decoder control module 135c passes the codeword to the key item decoder neural network 135k, where in step 1603 it decodes and recovers a reconstruction of the encoder input vector x_t to generate a representation of the input data item.
  • the decoder control module 135c passes the codeword to the interpolation item decoder neural network 135i, defined as a function which takes a tuple as input and, in step 1605, decodes the noisy detected codeword by mapping it to an estimate of the frame latent vector. As can be seen from the input, the mapping in step 1605 is based on the received signal values in the codeword and a latent representation of the previous data item.
  • in step 1606, the estimate of the latent vector for time step t decoded by the interpolation item decoder neural network 135i in step 1605 is passed to the key item decoder neural network 135k, as in step 1604, to decode the latent vector and recover a reconstruction of the encoder input vector x_t, interpolated from the reconstruction of the previous encoder input vector x_{t-1}, to generate a representation of the input data item for provision to the information sink 131.
  • the process performed by the decoder 135 using the key item decoder neural network 135k, the interpolation item decoder neural network 135i, and the version of the key item encoder neural network 115k function stored locally at the receiver 130, is set out in Algorithm 2 shown in Figure 14b.
  • the interpolation item decoder neural network 135i in essence provides a decoder process that recursively unpacks the codeword by conditioning on the latent vector from the previous time step.
  • the unpacking function is also performed by an LSTM module.
  • the internal state h_t represents the current state of the unpacked codeword y_t, and the input of the LSTM, x_t, represents the current latent vector (a decoder-side sketch follows below).
  • This recursive encoding and decoding process for successive interpolation items is illustrated in Figure 12 for increasing time steps.
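As illustration only, the following sketch mirrors the recursive unpacking described above: an LSTM cell whose hidden state tracks the unpacked codeword and which emits an estimate of each frame latent in turn. The dimensions, the read-out layer and the way the hidden state is seeded with the received codeword are assumptions for the sketch.

```python
import torch
import torch.nn as nn

class InterpolationDecoder(nn.Module):
    """Minimal sketch of a recursive interpolation item decoder (cf. 135i)."""
    def __init__(self, latent_dim: int = 256, code_dim: int = 128):
        super().__init__()
        self.cell = nn.LSTMCell(latent_dim, code_dim)
        self.readout = nn.Linear(code_dim, latent_dim)  # state -> latent estimate

    def forward(self, z_prev, state):
        h_t, c_t = self.cell(z_prev, state)   # condition on the previous latent
        z_hat = self.readout(h_t)             # estimate of the current frame latent
        return z_hat, (h_t, c_t)

dec = InterpolationDecoder()
y_noisy = torch.randn(1, 128)                 # noisy received codeword
state = (y_noisy, torch.zeros(1, 128))        # seed the hidden state with the codeword
z_prev = torch.randn(1, 256)                  # latent of the previously decoded frame
for _ in range(3):                            # unpack three interpolation items
    z_prev, state = dec(z_prev, state)        # each estimate then goes to decoder 135k
```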
  • the training of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, is such that the neural networks are trained together using the method generally described in relation to Figure 5 and the mean-squared error as the loss function, to optimise the weights of the hidden layers thereof to minimise a reconstruction error.
  • an AWGN noise model for the channel may be used during training (a minimal end-to-end training sketch under this assumption follows below).
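A minimal end-to-end training sketch, assuming an AWGN channel model and the mean-squared error loss as in the text, is given below. The toy convolutional encoder/decoder are placeholders for the key and interpolation network pairs, not the disclosed architectures, and the dimensions are illustrative.

```python
import torch
import torch.nn as nn

def awgn(z, snr_db: float):
    """Differentiable AWGN channel: unit-power signal plus scaled noise."""
    sigma = 10 ** (-snr_db / 20)
    return z + sigma * torch.randn_like(z)

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 8, 3, stride=2, padding=1))
decoder = nn.Sequential(nn.ConvTranspose2d(8, 16, 4, stride=2, padding=1), nn.ReLU(),
                        nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-4)

for _ in range(10):                        # a few toy steps
    x = torch.rand(4, 3, 32, 32)           # stand-in for a batch of frames
    z = encoder(x)
    z = z / z.flatten(1).norm(dim=1).view(-1, 1, 1, 1) * (z[0].numel() ** 0.5)
    x_hat = decoder(awgn(z, snr_db=10.0))  # noise is added inside the graph
    loss = nn.functional.mse_loss(x_hat, x)
    opt.zero_grad(); loss.backward(); opt.step()
```

Because the noise is added inside the computation graph, gradients of the reconstruction loss propagate through the channel model back to the encoder weights, which is what makes the end-to-end optimisation possible.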
  • the examples above indicate a software-driven implementation of components of the invention by a general-purpose processor, such as a CPU core, based on program logic or instructions stored in a memory.
  • certain components of the invention may be partly embedded as pre-configured electronic systems or embedded controllers and circuits embodied as programmable logic devices, using, for example, application-specific integrated circuits (ASICs) or Field-programmable gate arrays (FPGAs), which may be partly configured by embedded software or firmware.
  • ASICs application-specific integrated circuits
  • FPGAs Field-programmable gate arrays
  • the communication channel 120 should be understood as any transformation from the channel input space to the channel output space that includes a random transformation due to the channel.
  • the reference to the noise-affected version of the vector of signal values z received at the decoder should be understood to indicate that the input to the decoder is a vector of values correlated with the transmitted vector z of signal values (which is itself correlated with the input data x from the information source), as transformed by the communication channel 120, whether that transformation is ‘noise’ or another channel transformation.
  • the communication channel 120 should be understood as encompassing any channel that applies a random transformation to the channel output space.
  • the communication channel 120 may be one that also includes an existing channel encoder and decoder scheme, in which case the signal space of the channel input may be the predetermined finite set of symbols of the channel code (which could be bit values) for modulating a signal carrier to provide input signals in the alphabet of the input signal space for the communication channel.
  • the transformation applied by the communication channel 120 may, in embodiments, also include an existing channel code.
  • the encoder and decoder pairs may be configured to learn a mapping to a predefined alphabet of symbols corresponding to a message alphabet for an existing channel code by which the input signals to the given communications channel are modulated.
  • the encoder output vectors z output from the encoder 115 will be mapped into the message alphabet of the corresponding channel code (rather than, for example, the raw IQ values transmittable by the transmitter over the communication channel 120, as in the embodiments described above); a toy sketch of such an alphabet mapping is given below.
  • the noise-affected channel output ẑ input to the decoder neural network 135 may correspond to the decoded message of the existing channel decoder.
  • the encoder neural network 115 and decoder neural network 135 may learn an optimum mapping of the input information source 111 to inputs of an existing channel code of the communications channel 120 that reduces reconstruction errors at the output 131 of the decoder neural network 135.
  • this learned coding of the encoder neural network 115 and decoder neural network 135 is still optimised end-to-end based on the characteristics of the communication channel 120 to reduce reconstruction errors, even though in these alternative embodiments the communication channel 120 includes an existing channel code. This is unlike existing modular source codes which are defined independently of the random transformation applied by any channel.
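As a toy illustration of mapping continuous encoder outputs onto a fixed symbol alphabet, the sketch below quantises each (I, Q) pair to the nearest point of an assumed 4-point constellation; the alphabet and the nearest-neighbour rule are illustrative assumptions, not the disclosed mapping.

```python
import torch

# Assumed 4-point alphabet (QPSK-like I/Q pairs); not from the disclosure.
ALPHABET = torch.tensor([[1., 1.], [1., -1.], [-1., 1.], [-1., -1.]])

def map_to_alphabet(z_iq):
    """Map each continuous (I, Q) pair to its nearest constellation symbol."""
    d = torch.cdist(z_iq, ALPHABET)        # pairwise distances to symbols
    return ALPHABET[d.argmin(dim=1)]

z = torch.randn(6, 2)                      # continuous encoder outputs as I/Q pairs
print(map_to_alphabet(z))                  # discrete symbols for the channel code
```

In end-to-end training, such a hard quantisation step would typically need a differentiable surrogate (e.g. a straight-through estimator), since nearest-neighbour assignment has zero gradient almost everywhere.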

Abstract

A communication system is disclosed for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding. The communication system comprises a transmitter having an encoder for encoding input data from an information source for transmission of a transformed version of the input data across a communication channel. The encoder has a key item encoder neural network for encoding data items as key items independent of any other data item in the sequence, and an interpolation item encoder neural network for encoding data items as interpolation items using data representing the input data item and at least one previous data item in the sequence. The communication system also comprises a receiver having a decoder including complementary key item decoder neural network and interpolation item decoder neural network. The receiver is for receiving and demodulating a noise-affected version of the encoder output vectors based on the signal transmitted across the communication channel and passing them to the decoder for reconstructing the sequence of correlated data items. A training method is also disclosed in which the key item encoder neural network and interpolation item encoder neural network are trained together with the respective complementary key item decoder neural network and interpolation item decoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items.

Description

ENCODER, DECODER AND COMMUNICATION SYSTEM AND METHOD FOR CONVEYING SEQUENCES OF CORRELATED DATA ITEMS FROM AN INFORMATION SOURCE ACROSS A COMMUNICATION CHANNEL USING JOINT SOURCE AND CHANNEL CODING, AND METHOD OF TRAINING AN ENCODER NEURAL NETWORK AND DECODER NEURAL NETWORK FOR USE IN A COMMUNICATION SYSTEM [0001] This present application relates to an encoder, decoder and communication system comprising a transmitter and receiver incorporating the encoder and decoder for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding. In particular, the encoder and decoder are neural networks and in embodiments the information source is a video and the correlated data items are video frames. BACKGROUND [0002] An aim of a data communication system is to efficiently and reliably send data from an information source over a communication channel from a transmitter at as high a rate as possible with as few errors as achievable in view of the channel noise, to enable a faithful representation of the original information source to be recovered at a receiver. [0003] Information sources providing sequences of correlated data items which share similarities and encapsulate data redundancy from one item in the sequence to the next can represent a significant data payload for transmission between transmitters and receivers over communications channels. For example, video content as a sequence of video frames containing images that are typically heavily correlated over time as the video develops. For example, a video of a largely static scene such as from a security camera remains largely unchanged from one video frame to the next. As of 2021, video transmission makes up around 80% of traffic on the Internet by volume, and the data burden on transmitters and receivers to correctly and efficiently transmit video data and other correlated sequences of data over communication channels is high. [0004] Most digital communication systems today include a source encoder and separate channel encoder at a transmitter and a source decoder and separate channel decoder at a receiver. [0005] In digital communication systems, to transmit data from the information source over a communication channel, the symbols of source data are first digitally compressed into bits by the source encoder. The goal in source coding is to encode the sequence of source symbols into a coded representation of data elements to reduce the redundancy in the original sequence of source symbols. In lossless compression one has to remove redundancy such that the original information source can still be reconstructed as the original version from the coded representation, while lossy compression allows a certain amount of degradation in the reconstructed version under some specified distortion measure, for example squared error. For videos, H264/MPEG is an example of a lossy source compression standards widely used in practice. Compressing the information source using a source encoder before transmission means that fewer resources are required for that transmission. [0006] Once the data from the information source has been encoded to compress it down in size, to transfer this representation over a communication channel, the output of the source encoder is then provided to a channel encoder. 
The goal of the channel encoder is to encode the compressed data representation in a structured way using a suitable Error Correction Code (ECC) by adding redundancy such that even if some of these bits are distorted or lost due to noise over the channel, the receiver can still recover the original sequence of bits reliably. The amount of redundancy that is added depends on the statistical properties of the underlying communication channel and the target Bit Error Rate (BER). Generally, such channel coding schemes using Forward Error Correction (FEC) provide for a faithful recovery of the transmitted data elements (such as a compressed data source) despite the noise in the channel. However, as channel noise increases, BER will increase drastically and, when the channel noise is too high, the signal transmission will drop out completely, meaning the transmitted data cannot be recovered. There are many different channel coding techniques in practice that provide various complexity and performance trade-offs. Turbo codes and Low-density parity-check (LDPC) codes are examples of ECCs that are commonly used in modern communication systems such as WiMAX and fourth generation Long-Term Evolution (LTE) mobile communications. [0007] The coded bits at the output of the channel encoder are transmitted over the channel using a modulator. The modulator converts the bits into signals that can be transmitted over the communication medium. For example, in wireless systems using Quadrature Modulation of two out-of-phase amplitude modulated carrier signals, the transmitted waveform is specified by its In-Phase (I) and Quadrature (Q) components, and a modulator typically has a discrete set of pre-specified I and Q values, called a constellation, and each group of coded information bits are mapped to a single point in this constellation. Example modulation schemes include phase shift keying (PSK) and quadrature amplitude modulation (QAM). [0008] The receiver receives and demodulates (for example, by coherent demodulation) a sequence of noisy symbols, where the noise has been added by the communication channel. These noisy demodulated symbols are then mapped to sequences of data elements by a channel decoder. The decoded data elements are then passed to the source decoder, which decodes these data elements to try to reconstruct a representation of the original input source symbols to reconstruct the information source. [0009] Naturally, the source encoder and decoder are designed jointly, as are the channel encoder and decoder, but the source encoder/decoder and channel encoder/decoder are designed and operate separately to perform very different functions. [0010] The main advantage of separate source and channel coding is the modularity it provides. This means that the same channel encoder and decoder can be used in conjunction with any source encoder and decoder. That is, as long as the source encoder outputs data elements that can be encoded by the channel encoder, it does not matter if these bits come from an image compressor or a video encoder. Thus, a channel encoder can encode data elements for transmission over a channel irrespective of the data elements or the information source from which they have been derived. [0011] Similarly, the source encoder and decoder can be operated in conjunction with any channel encoder and decoder to transmit the encoded source symbols over a communication channel. 
Thus, a source encoder can encode data elements for subsequent coding by the channel encoder independently of which channel encoder is used.
[0012] The transmission of sequences of correlated data items, such as video, in this way is particularly onerous and can have significant implications and requirements for quality, latency and error rate, in particular for continuous transmission and reception of continuous flows of such data, as in video streams.
[0013] For wireless video transmission, the problem is broken down into two core components: a source encoder that converts the video into a sequence of bits of the shortest possible length, from which a reconstruction of the original video sequence is possible within an allowable distortion; and a channel encoder that introduces redundancies such that the source encoded information is protected against channel distortions and interference. This separate source and channel coding design provides modularity and allows independent optimisation of each component, which was theoretically shown by Shannon (Shannon, 1948) to be optimal for point-to-point communication over static channel conditions in the asymptotic infinite blocklength regime.
[0014] However, as more communication-intensive paradigms emerge, such as wireless virtual reality (VR) and drone-based surveillance systems, which have ultra-low latency requirements and unpredictable channel conditions, the limits of the separation-based designs are beginning to show. In such scenarios, the compression delay and the feedback necessary to track the instantaneous channel condition under constant variation are challenging. Also, the theoretical optimality of separation for communication utilising infinite blocklengths with unlimited delay and complexity becomes less relevant for low-latency systems that require short blocklengths and low-complexity operations. Moreover, separation-based communication leads to what is known as the cliff effect: when the channel condition deteriorates below the condition that the channel encoder had anticipated, the source information is lost completely, leading to a cliff edge deterioration of the system performance. As a result, most current systems operate at a much more conservative data transmission rate than that suggested by the instantaneous channel capacity, and employ additional error correction mechanisms through automatic repeat request (ARQ).
[0015] For example, due to the cliff edge effect, if the bit error rate in the transmission of a video over a communications channel exceeds the maximum error rate at which the channel decoder is able to decode the received signal, the transmission drops out completely, which can present significant challenges for live streamed video, such as from a drone to a base station. This can lead to discontinuous reception of transmitted video, and can force the source encoder to encode the video at ever more lossy compression levels and lower resolutions.
[0016] Effective and efficient transmission of sequences of correlated data items, such as video, over noisy communications channels, allowing continuous reception while keeping errors in reception to a minimum, is therefore desirable in view of the particular performance requirements and volumes of this type of data to be transmitted.
[0017] It is in the above context that the present disclosure has been devised. 
BRIEF SUMMARY OF THE DISCLOSURE [0018] Recently, new alternatives for the design of wireless communication systems have been proposed. For example, machine learning (ML) approaches to encoding an information source to channel symbols for transmission across a communication channel have been proposed. By using ML in this way, new encoding schemes can be discovered and freely produced that optimise the efficient transmission of an information source across a noisy communication channel, without being limited to existing source or channel coding paradigms, often outperforming these legacy handcrafted approaches. Such new approaches to encoding information sources are optimised using ML but do not in any way take into account any correlation in the information source, for example, between sequences of frames of video data. [0019] Thus, viewed from one aspect, the present disclosure provides an encoder, a decoder and communication system for conveying data from an information source across a communication channel using joint source and channel coding, comprising a transmitter and a receiver. The communication system comprises a transmitter having an encoder for encoding input data from an information source for transmission of a transformed version of the input data across a communication channel. The encoder has a key item encoder neural network for encoding data items as key items independent of any other data item in the sequence, and an interpolation item encoder neural network for encoding data items as interpolation items using data representing the input data item and at least one previous data item in the sequence. The communication system also comprises a receiver having a decoder including complementary key item decoder neural network and interpolation item decoder neural network. The receiver is for receiving and demodulating a noise-affected version of the encoder output vectors based on the signal transmitted across the communication channel and passing them to the decoder for reconstructing the sequence of correlated data items. A training method is also disclosed in which the key item encoder neural network and interpolation item encoder neural network are trained together with the respective complementary key item decoder neural network and interpolation item decoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items. [0020] Viewed from another aspect, the present disclosure provides an encoder for use in a transmitter of a communications system for conveying sequences of correlated data items from an information source across a communications channel using joint source and channel coding. The encoder comprises a key item encoder neural network for encoding data items selected from the sequence to serve as a key item that can be directly reconstructed by the decoder, the key item encoding being based on the input data item and being independent of any other data item in the sequence. The encoder further comprises an interpolation item encoder neural network for encoding data items selected from the sequence to serve as an interpolation item that can be reconstructed by the decoder using interpolation, the interpolation item encoding using data representing the input data item and at least one previous data item in the sequence. 
The key item encoder neural network and interpolation item encoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in an encoder input layer thereof to encoder output vectors provided at nodes of an encoder output layer thereof, the encoder output vectors being used for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the data items across a communication channel. The key item encoder neural network and interpolation item encoder neural network have in the communications system respective complementary key item decoder neural network and interpolation item decoder neural network for receiving a noise-affected version of the encoder output vector from a receiver receiving and demodulating the signal transmitted across the communication channel and reconstructing the input vector to generate a representation of the input data item. The connecting node weights of the key item encoder neural network and interpolation item encoder neural network have been trained together with the respective complementary key item decoder neural network and interpolation item decoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items. [0021] Viewed from another aspect, the present disclosure provides a decoder for use in a receiver of a communications system for conveying sequences of correlated data items from an information source across a communications channel using joint source and channel coding. The decoder comprises a key item decoder neural network for decoding data items from the sequence indicated as key items, the key item decoding being based on a noise-affected version of an encoder output vector generated at a transmitter by a complementary key item encoder neural network to encode the data item based on the input data item independent of any other data item in the sequence, the key item decoder neural network being configured for reconstructing the input vector to generate a representation of the input data item directly from the noise-affected version of the encoder input vector and independently of any other data item in the sequence. The decoder further comprises an interpolation item decoder neural network for decoding data items indicated as interpolation items, the key item decoding being based on data representing at least one previous data item in the sequence and a noise-affected version of an encoder output vector generated at a transmitter by a complementary interpolation item encoder neural network to encode the data item based on data representing the input data item and at least one previous data item in the sequence, the noise-affected version of the encoder output vector having been received and demodulated at the receiver based on the signal transmitted across the communication channel, the key item decoder neural network being configured for reconstructing the input vector by interpolation from reconstructions of representations of other data items in the sequence to generate a representation of the input data item. 
The key item decoder neural network and interpolation item decoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in a decoder input layer thereof to decoder output vectors provided at nodes of an decoder output layer thereof, the decoder output vectors providing a reconstruction of the encoder input vector to generate a representation of the input data item. The connecting node weights of the key item decoder neural network and interpolation item decoder neural network have been trained together with the respective complementary key item encoder neural network and interpolation item encoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data. [0022] Viewed from another aspect, the present disclosure provides a communication system for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding. The communication system comprises a transmitter including an encoder and a receiver including a decoder. The encoder comprises a key item encoder neural network for encoding data items selected from the sequence to serve as a key item that can be directly reconstructed by the decoder, the key item encoding being based on the input data item and being independent of any other data item in the sequence. The encoder further comprises an interpolation item encoder neural network for encoding data items selected from the sequence to serve as an interpolation item that can be reconstructed by the decoder using interpolation, the interpolation item encoding using data representing the input data item and at least one previous data item in the sequence. The key item encoder neural network and interpolation item encoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in an encoder input layer thereof to encoder output vectors provided at nodes of an encoder output layer thereof, the encoder output vectors being used for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the data items across a communication channel. The transmitter is configured for transmitting signals over the communication channel based on signal values of the encoder output vectors of the key item encoder neural network and interpolation item encoder neural network. The receiver is configured for receiving and demodulating a noise-affected version of the encoder output vectors based on the signal transmitted across the communication channel and passing them to the decoder for reconstructing the sequence of correlated data items. The decoder comprises a key item decoder neural network for decoding data items from the sequence indicated as key items, the key item decoding being based on a noise-affected version of an encoder output vector generated at a transmitter, the key item decoder neural network being configured for reconstructing the input vector to generate a representation of the input data item directly from the noise-affected version of the encoder input vector and independently of any other data item in the sequence. 
The decoder further comprises an interpolation item decoder neural network for decoding data items indicated as interpolation items, the key item decoding being based on data representing at least one previous data item in the sequence and a noise-affected version of an encoder output vector generated at a transmitter. The interpolation item decoder neural network is configured for reconstructing the input vector by interpolation from reconstructions of representations of other data items in the sequence to generate a representation of the input data item. The key item decoder neural network and interpolation item decoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in a decoder input layer thereof to decoder output vectors provided at nodes of an decoder output layer thereof, the decoder output vectors providing a reconstruction of the encoder input vector to generate a representation of the input data item. The connecting node weights of the key item encoder neural network and interpolation item encoder neural network have been trained together with the connecting node weights of the respective complementary key item decoder neural network and interpolation item decoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items. [0023] Viewed from another aspect, the present disclosure provides a method for conveying sequences of correlated data items from an information source across a communications channel using joint source and channel coding. The method comprises, at a transmitter, for each data item in the sequence: selecting data items from the sequence of data items to serve as key items and interpolation items; encoding data items to serve as key items using a key item encoder neural network, the key item encoding being based on the input data item and being independent of any other data item in the sequence; encoding data items to serve as interpolation items using an interpolation item encoder neural network, the interpolation item encoding using data representing the input data item and at least one previous data item in the sequence. The key item encoder neural network and interpolation item encoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in an encoder input layer thereof to encoder output vectors provided at nodes of an encoder output layer thereof, the encoder output vectors being used for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the data items across a communication channel. The method further comprises, at the transmitter, transmitting signals over a communication channel based on signal values of encoder output vectors of the key item encoder neural network and interpolation item encoder neural network. 
The method further comprises, at a receiver: receiving and demodulating a noise-affected version of the encoder output vectors based on the signal transmitted across the communication channel; decoding data items from the sequence indicated as key items using a key item decoder neural network based on a noise-affected version of the encoder output vector for the data item, the key item decoder neural network being configured for reconstructing the input vector to generate a representation of the input data item directly from the noise-affected version of the encoder input vector and independently of any other data item in the sequence; and decoding data items from the sequence indicated as interpolation items using an interpolation item decoder neural network based on data representing at least one previous data item in the sequence and the noise-affected version of an encoder output vector for the data item, the interpolation item decoder neural network being configured for reconstructing the input vector by interpolation from reconstructions of representations of other data items in the sequence to generate a representation of the input data item. The key item decoder neural network and interpolation item decoder neural network have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in a decoder input layer thereof to decoder output vectors provided at nodes of a decoder output layer thereof, the decoder output vectors providing a reconstruction of the encoder input vector to generate a representation of the input data item. The connecting node weights of the key item encoder neural network and interpolation item encoder neural network have been trained together with the respective complementary key item decoder neural network and interpolation item decoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items.
[0024] Viewed from another aspect, the present disclosure provides a computer readable medium comprising one or more instructions which when executed cause at least one of: a transmitter; and a receiver; to operate in accordance with the above-described method.
[0025] In accordance with these aspects of the present disclosure, a machine-learned method for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding, which optimises the reconstructed quality end-to-end, is achieved. This method deviates from the separation-based designs by optimising a single encoder and decoder, which jointly provide the same or better performance compared to expert-designed, modular systems.
[0026] In embodiments, the encoder is configured to encode the sequences of correlated data items from an information source for transmission across the communications channel as a streaming media. In embodiments, the encoder is configured to encode the sequences of correlated data items from an information source into a static media file.
[0027] In embodiments, the sequences of correlated data items are a series of image frames providing a video. In embodiments, the correlated data items are each represented by a 3D matrix with a depth based on the colour channels, a height based on the height of the frame and a width based on the width of the frame. 
In embodiments, the encoder input layers of the key item encoder neural network and/or the interpolation item encoder neural network are configured to receive video frames as input vectors. [0028] In embodiments, the encoder is configured to create a coded and compressed representation of the input data as an encoder output vector for transmission across the communications channel, and the decoder is configured to decode the noise-affected version of the encoder output vector back into an uncompressed reconstruction of the input data. In embodiments, the quantity of information included in the output vector of the key item encoder neural network and/or the interpolation item encoder neural network is smaller than the input vector. [0029] In embodiments, the interpolation encoder input layer is configured such that the interpolation item encoding uses data received at the input layer thereof representing the input data item in input data space and at least one previous data item in the sequence in input data space. In embodiments, the data representing at least one previous data item in the sequence used by the interpolation item encoder to encode interpolation items comprises: the motion representation information representing the relative motion between the data item and at least one other data item in the sequence; and the residual information between the data item and a motion compensated version of at least one other data item in the sequence using the motion representation information in respect of that data item. [0030] In other embodiments, the interpolation encoder input layer is configured such that the interpolation item encoding uses data received at the input layer thereof representing: the input data item in the latent space defined by the output of the key item encoder neural network; and at least one previous data item in the sequence in the latent space defined by the output of the key item encoder neural network. In embodiments, the data representing the input data item used by the interpolation item encoder to encode interpolation items comprises: the key item encoder output vector encoded for the data item by the key item encoder neural network; and wherein the data representing at least one previous data item in the sequence used by the interpolation item encoder to encode interpolation items comprises an encoder output vector transmitted by the encoder for at least one previous data item in the sequence. In embodiments, the data representing at least one previous data item in the sequence used by the interpolation item decoder to decode interpolation items comprises a noise-affected version of an encoder output vector or a reconstruction of the encoder input vector providing a representation of the input data item for at least one previous data item in the sequence. [0031] In embodiments, the data representing at least one previous data item in the sequence used by the interpolation item encoder to encode interpolation items further comprises data representing at least one subsequent data item in the sequence. 
[0032] In embodiments, the encoder and decoder further comprise a static control module configured to select data items from the sequence of data items as key items and interpolation items according to a fixed order specified by a predetermined group of items, the static control module being further configured to use the key item encoder and decoder to encode and decode data items selected as key items, and to use the interpolation item encoder and decoder to encode and decode data items selected as interpolation items. [0033] In other embodiments, the encoder further comprises a dynamic control module having a dynamic decision agent configured to dynamically choose whether the data item is to serve as a key item or an interpolation item. In embodiments, the dynamic decision agent is configured to dynamically choose whether the data item is to serve as a key item or an interpolation item based at least on one or more of: the current data item; the number of data items transmitted since last key item; a current average channel utilisation; and a channel utilisation constraint. In embodiments, the dynamic decision agent is configured to dynamically choose whether the data item is to serve as a key item or an interpolation item so that the average channel utilisation is below the channel utilisation constraint. In embodiments, the dynamic control module is configured to: select, based on a decision output by decision agent for the data item, whether the data item is to serve as a key item or an interpolation item; if the data item is selected to serve as a key item, use the key item encoder to encode the data item in the sequence to provide a key item encoder output vector for the item, the encoder being configured for transmitting the key item encoder output vector on the communications channel. In embodiments, the dynamic control module is further configured to: if the data item is selected to serve as an interpolation item, use the interpolation encoder to encode the data item to provide an interpolation item encoder output vector for the item, the encoder being configured for transmitting the interpolation item encoder output vector on the communications channel. In embodiments, the dynamic decision agent is configured to generate data mapping for the sequence of data items, which data items are key data items and which data items are interpolation data items, for transmission across the communications channel and for use by the decoder to determine whether the received noise-affected version of an encoder output vector should be decoded by the key item decoder neural network or the interpolation item decoder neural network. [0034] In embodiments, the encoder output layers of the interpolation item encoder neural network and the decoder input layers of the interpolation item decoder neural network are divided into ordered blocks, and the neural networks are trained such that the interpolation item encoder neural network encodes descending ordering of information in increasing blocks of nodes, and such that, for interpolation items, the decoder reconstructs an increasingly refined representation of the input data with increasing blocks received in the noise-affected version of an encoder output vector. 
In embodiments, the communication system further comprises a bandwidth allocation module configured to determine, for each data item in the sequence selected to serve as an interpolation item, a number of blocks of the interpolation encoder output layer to be transmitted over the communication channel, so as to allocate the available bandwidth in the communications channel to the transmission of interpolation items. In embodiments, the bandwidth allocation module is further configured to determine the number of blocks of the interpolation encoder output layer to be transmitted over the communication channel to seek to minimise the reconstruction error between the representation of the input data reconstructed at the decoder and the input data encoded at the encoder. In embodiments, the bandwidth allocation module is configured to determine the number of blocks of the interpolation encoder output layer to be transmitted over the communication channel based on at least motion representation information determined to represent the relative motion between the data item and at least one other data item in the sequence. In embodiments, the bandwidth allocation module is configured to determine a number of blocks of the interpolation encoder output layer to be transmitted over the communication channel, so as to seek to optimally allocate the available bandwidth in the communications channel between a group, a set or the whole sequence of data items to be transmitted. [0035] In other embodiments, the interpolation encoder neural network is configured to: maintain and update an internal state as successive interpolation items of a group of consecutive interpolation items are encoded by the interpolation encoder neural network; and after successive interpolation items have been encoded into the internal state, to provide the internal state as the interpolation encoder output vector for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the group of consecutive interpolation data items across the communication channel. In embodiments, the encoder neural network is configured to output an encoder output vector for transmission for each key item and each group of consecutive interpolation items between key items. In embodiments, the interpolation decoder neural network is configured to: for a group of consecutive interpolation items, recursively decode the noise-affected version of the encoder output vector received from a receiver to thereby reconstruct the encoder input vectors of successive interpolation items to generate a representation of the input data items of the group of consecutive interpolation items. In embodiments, the interpolation encoder neural network and the interpolation decoder neural network are both provided by a recurrent neural network, optionally a Long Short-Term Memory (LSTM) network. [0036] In embodiments, the encoder output vectors provide values in a signal space that represent in-phase and quadrature components for modulation of the carrier signal for transmission over the communication channel. In other embodiments, the encoder output vectors provide values defining a probability distribution sampleable to provide values in a signal space that represent in-phase and quadrature components for modulation of the carrier signal for transmission over the communication channel. 
In embodiments, the encoder output vectors provide values in the signal space that are assigned exactly to symbol values of a predetermined finite set S of symbols of a predefined alphabet of carrier modulation signal values transmittable by the transmitter over the communication channel. In embodiments, the predefined alphabet is a fixed, predefined constellation of symbols for digitally modulating the carrier signal to encode the input data for transmission over the communication channel. [0037] In other embodiments, the encoder output vectors provide values corresponding to a predetermined finite set of symbols of an existing channel encoder and decoder scheme for transmission of data over the communication channel. Thus, besides random noise applied by the communication channel, the transformation applied by the communication channel may, in embodiments, also include an existing channel code. Thus, in these embodiments, the encoder and decoder may learn an optimum mapping of the input information source to inputs of an existing channel code of the communications channel that reduces reconstruction errors at the output of the decoder neural network. Although acting as an outer code in these embodiments, this learned coding of the encoder and decoder is still optimised based on the characteristics of the communication channel to reduce reconstruction errors, even though in these alternative embodiments the communication channel includes an existing channel code. This is unlike existing modular source codes which are defined independently of the random transformation applied by any channel. [0038] Viewed from another aspect, the present disclosure provides a method of training an encoder and a decoder for use in a communication system in accordance with the above aspects and embodiments of the present disclosure for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding. The method comprises: for input-output pairs of a set of training data items from the information source passed to the encoder, determining an objective function characterising a reconstruction error between input-output pairs of training data from the information source passed to the encoder and the representation of the input data reconstructed at the decoder; and using an appropriate optimisation algorithm operating on the objective function, updating the connecting node weights of the hidden layers of the key item encoder neural network, interpolation item encoder neural network, key item decoder neural network and interpolation item decoder neural network to seek to minimise the objective function. [0039] In embodiments, the encoder neural networks and decoder neural networks have been trained together using training data in which a model of the communication channel is used to estimate channel noise and add it to the transmitted signal values to generate a noise-affected version of the vector of signal values in the input-output pairs of training data. 
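As illustration only, the following sketch shows the shape of such joint training: the four networks share one reconstruction objective and one optimiser, with a channel model applied between encoding and decoding inside the training loop. The tiny linear stand-ins and their dimensions are assumptions for the sketch, not the disclosed architectures.

```python
import itertools
import torch
import torch.nn as nn

# Key/interpolation encoder-decoder pairs as toy stand-ins (assumed shapes).
key_enc, key_dec = nn.Linear(64, 16), nn.Linear(16, 64)
int_enc, int_dec = nn.Linear(16, 16), nn.Linear(16, 16)

params = itertools.chain(key_enc.parameters(), key_dec.parameters(),
                         int_enc.parameters(), int_dec.parameters())
opt = torch.optim.Adam(params, lr=1e-4)

x_t = torch.rand(8, 64)                            # input item of a training pair
z = int_enc(key_enc(x_t))                          # encode via both stages
z_noisy = z + 0.1 * torch.randn_like(z)            # channel model inside the loop
x_hat = key_dec(int_dec(z_noisy))                  # reconstruct the input item
loss = nn.functional.mse_loss(x_hat, x_t)          # reconstruction objective
opt.zero_grad(); loss.backward(); opt.step()       # one update of all four networks
```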
[0040] In embodiments, the encoder output layers of the interpolation item encoder neural network and the decoder input layers of the interpolation item decoder neural network are divided into ordered blocks, wherein the encoder output vector passed to the decoder input layer for each input-output pair during training is selected to have a random number of the ordered blocks, such that, following training, the interpolation item encoder neural network encodes descending ordering of information in increasing blocks of nodes, and such that, for interpolation items, the decoder reconstructs an increasingly refined representation of the input data with increasing blocks received in the noise-affected version of an encoder output vector. [0041] Viewed from another aspect, the present disclosure provides a computer readable medium comprising one or more instructions, which when executed cause a computing device to operate the above-described methods of training an encoder and a decoder for use in the above-described communication systems for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding. [0042] It will be appreciated from the foregoing disclosure and the following detailed description of the examples that certain features and implementations described as being optional in relation to any given aspect of the disclosure set out above should be understood by the reader as being disclosed also in combination with the other aspects of the present disclosure, where applicable. Similarly, it will be appreciated that any attendant advantages described in relation to any given aspect of the disclosure set out above should be understood by the reader as being disclosed as advantages of the other aspects of the present disclosure, where applicable. That is, the description of optional features and advantages in relation to a specific aspect of the disclosure above is not limiting, and it should be understood that the disclosures of these optional features and advantages are intended to relate to all aspects of the disclosure in combination, where such combination is applicable. 
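Purely as an illustration of the training procedure of paragraphs [0038] and [0040], the sketch below shows one possible training step in which a random prefix of the ordered blocks is kept, reusing the awgn_channel helper from the previous sketch. The module interfaces (an encoder returning block-structured output, a decoder taking a num_blocks argument), the block count and the choice of mean squared error as the objective are all assumptions, not features of the disclosure.

```python
import torch

NUM_BLOCKS = 8  # assumed number of ordered blocks in the encoder output layer

def train_step(encoder, decoder, optimiser, x, snr_db):
    """One training iteration with random truncation of the ordered blocks.

    Keeping only a random prefix of the blocks at each step trains the
    encoder to place the most important information in the earliest
    blocks, so that the decoder reconstructs an increasingly refined
    representation as more blocks of the noise-affected output arrive.
    """
    z = encoder(x)                                      # (batch, NUM_BLOCKS, block_dim)
    k = torch.randint(1, NUM_BLOCKS + 1, (1,)).item()   # random number of blocks kept
    mask = torch.zeros_like(z)
    mask[:, :k, :] = 1.0

    z_hat = awgn_channel(z * mask, snr_db)              # channel model from the sketch above
    z_hat = z_hat * mask                                # untransmitted blocks stay exactly zero
    x_hat = decoder(z_hat, num_blocks=k)                # decoder told how many blocks arrived

    loss = torch.nn.functional.mse_loss(x_hat, x)       # reconstruction error objective
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```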
BRIEF DESCRIPTION OF THE DRAWINGS [0043] Embodiments of the invention are further described hereinafter with reference to the accompanying drawings, in which:
Figure 1 shows a communication system for conveying sequences of correlated data items, such as video, from an information source across a communications channel using joint source and channel coding in accordance with an example of the present disclosure;
Figure 2 shows the encoding, transmission and reconstruction of sequences of correlated data items in the form of video frames from an information source using a communication system in accordance with an example of the present disclosure;
Figure 3 shows an example run time method for the transmitter and the encoder in accordance with an example of the present disclosure;
Figure 4 shows an example run time method for the receiver and the decoder in accordance with an example of the present disclosure;
Figure 5 shows an example training time method for the neural networks of the encoder and decoder in accordance with an example of the present disclosure;
Figure 6 shows a structure of a communication system in accordance with an example of the present disclosure, showing the use of a key item encoder and decoder and an interpolation item encoder and decoder encoding the interpolation information in input data space;
Figure 7 shows an architecture of a key item encoder and decoder in use in the example communication system of Figure 6 for encoding, transmitting and reconstructing data items selected as key items;
Figure 8 shows an architecture of an interpolation item encoder and decoder and motion and residual modules in use in the example communication system of Figure 6 for encoding, transmitting and reconstructing data items selected as interpolation items;
Figure 9 shows an example run time method for the transmitter and the encoder neural network in accordance with the example communication system of Figure 6;
Figure 10 shows an example run time method for the receiver and the decoder neural network in accordance with the example communication system of Figure 6;
Figure 11 shows an architecture of a bandwidth allocation neural network in use in the example communication system of Figure 6 for determining a number of blocks of the interpolation item encoder output vector for transmission for the items of a group of items;
Figure 12 shows a structure of a communication system in accordance with an example of the present disclosure, showing the use of a key item encoder and decoder and an interpolation item encoder and decoder encoding the interpolation information in latent space;
Figure 13 shows an architecture of an interpolation item encoder and decoder for use in the example communication system of Figure 12 for encoding, transmitting and reconstructing data items selected as interpolation items;
Figure 14 shows example algorithms for controlling the encoding and decoding of data items for use in the example communication system of Figure 12;
Figure 15 shows an example run time method for the transmitter and the encoder neural network in accordance with the example communication system of Figure 12;
Figure 16 shows an example run time method for the receiver and the decoder neural network in accordance with the example communication system of Figure 12;
Figure 17 shows a performance of an example of the communication system of Figure 6 for encoding, transmitting and reconstructing example sequences of correlated data items at various channel signal-to-noise (CSNR) ratios;
Figure 18 shows a performance comparison of the performance envelope of the example of the communication system of Figure 6 as shown in Figure 17 and the performance of separate source coding by H.264 (i.e. MPEG-4 Part 10) and channel coding by a low-density parity-check (LDPC) code at different code rates for encoding, transmitting and reconstructing the example sequences of correlated data items at various channel signal-to-noise (CSNR) ratios;
Figure 19 shows a performance comparison of the performance envelope of another example of the communication system of Figure 6 and the performance of separate source coding by H.265 (i.e. MPEG-H Part 2) and channel coding by a low-density parity-check (LDPC) code at different code rates for encoding, transmitting and reconstructing other example sequences of correlated data items over a channel having AWGN at various signal-to-noise (SNR) ratios;
Figure 20 shows a visual comparison of reconstructed frames of an example video encoded and transmitted across a channel having additive white Gaussian noise at 13dB, 3dB and -4dB, by an example of the communication system of Figure 6 trained at different SNRs and by separate source coding by H.264 (i.e. MPEG-4 Part 10) and channel coding by a low-density parity-check (LDPC) code using different channel code schemes;
Figure 21 shows a performance comparison of another example of the communication system of Figure 6 and the performance of separate source coding by H.264/H.265 and channel coding by a low-density parity-check (LDPC) 3/4 16QAM code for encoding, transmitting and reconstructing other example sequences of correlated data items over a channel having AWGN at a signal-to-noise (SNR) ratio of 20dB for different bandwidth compression rates; and
Figure 22 shows a performance comparison of the performance envelope of another example of the communication system of Figure 6, showing the difference in performance of the system having a uniform bandwidth allocation to the frames in a group of pictures, a non-uniform bandwidth allocation in accordance with a pre-determined heuristic, and an optimal bandwidth allocation based on embodiments of the present disclosure in which a bandwidth allocation module is used.
DETAILED DESCRIPTION
[0044] Hereinafter, embodiments of the disclosure are described with reference to the accompanying drawings. However, it should be appreciated that the disclosure is not limited to the embodiments, and all changes and/or equivalents or replacements thereto also belong to the scope of the disclosure. The same or similar reference denotations may be used to refer to the same or similar elements throughout the specification and the drawings. [0045] As used herein, the terms “have,” “may have,” “include,” or “may include” a feature (e.g., a number, function, operation, or a component such as a part) indicate the existence of the feature and do not exclude the existence of other features. [0046] As used herein, the terms “A or B,” “at least one of A and/or B,” or “one or more of A and/or B” may include all possible combinations of A and B. For example, “A or B,” “at least one of A and B,” “at least one of A or B” may indicate all of (1) including at least one A, (2) including at least one B, or (3) including at least one A and at least one B.
[0047] As used herein, the terms “first” and “second” may modify various components regardless of importance and do not limit the components. These terms are only used to distinguish one component from another. For example, a first user device and a second user device may indicate different user devices from each other regardless of the order or importance of the devices. For example, a first component may be denoted a second component, and vice versa without departing from the scope of the disclosure. [0048] It will be understood that when an element (e.g., a first element) is referred to as being (operatively or communicatively) “coupled with/to,” or “connected with/to” another element (e.g., a second element), it can be coupled or connected with/to the other element directly or via a third element. In contrast, it will be understood that when an element (e.g., a first element) is referred to as being “directly coupled with/to” or “directly connected with/to” another element (e.g., a second element), no other element (e.g., a third element) intervenes between the element and the other element. [0049] As used herein, the terms “configured (or set) to” may be interchangeably used with the terms “suitable for,” “having the capacity to,” “designed to,” “adapted to,” “made to,” or “capable of” depending on circumstances. The term “configured (or set) to” does not essentially mean “specifically designed in hardware to.” Rather, the term “configured to” may mean that a device can perform an operation together with another device or parts. [0050] For example, the term “processor configured (or set) to perform A, B, and C” may mean a generic-purpose processor (e.g., a CPU or application processor) that may perform the operations by executing one or more software programs stored in a memory device or a dedicated processor (e.g., an embedded processor) for performing the operations. [0051] The terms as used herein are provided merely to describe some embodiments thereof, but not to limit the scope of other embodiments of the disclosure. It is to be understood that the singular forms “a,” “'an,” and “the” include plural references unless the context clearly dictates otherwise. All terms including technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the embodiments of the disclosure belong. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein. In some cases, the terms defined herein may be interpreted to exclude embodiments of the disclosure. [0052] As used throughout the Figures, features or method steps are shown outlined in broken lines to indicate that such features or method steps are optional features for provision in some embodiments, but which are not provided in all embodiments to implement aspects of the disclosure. That is, aspects of the disclosure do not require these optional features to be included, or steps to be performed, and they are merely included in illustrative embodiments to provide further optional implementation details. [0053] Reference will now be made to Figure 1 and Figure 2. 
Figure 1 shows a communication system 100 comprising a transmitter 110 for conveying sequences of correlated data items, such as video, from an information source 111 across a communication channel 120 to a receiver 130 using joint source and channel coding in accordance with an example of the present disclosure. Figure 2 shows the encoding, transmission and reconstruction of sequences of correlated data items in the form of video frames from an information source 111 using the communication system 100. [0054] The transmitter 110 and receiver 130 may each be part of respective electronic devices for transmitting or receiving sequences of correlated data items, such as video. For example, the electronic device coupled to the transmitter 110 or receiver 130 may be a smartphone, a tablet, a personal computer such as a desktop computer, a laptop computer, a netbook computer, a workstation, a server, a wearable device such as a smart watch, smart glasses, a head-mounted device or smart clothes, an airborne or land drone, a robot or other autonomous device such as industrial or home robots, a security control panel, a gaming console, a security camera, a microphone, or an Internet of Things device for sensing or monitoring, such as a smart meter, various sensors, an electric or gas meter, a medical device such as a portable medical measuring device, a blood sugar measuring device, a heartbeat measuring device, or a body temperature measuring device, a navigation device, a global positioning system (GPS) receiver, an event data recorder (EDR), a flight data recorder (FDR), avionics, or point of sale devices. The electronic device may also be a base station or relay in a radio communication system, which may be capable of operating in accordance with one or more communications standards, such as the wireless communication standards 802.11xx for WiFi maintained by the Institute of Electrical and Electronics Engineers (IEEE), and the 3G, LTE and NR standards for cellular communications maintained by the 3rd Generation Partnership Project (3GPP), or any other radio transceiver for receiving signals transmitted across the communications channel, and decoding them for onward transmission, for example on the Internet. [0055] The transmitter 110 includes an information source 111, at least one processor 112, memory 113 and a carrier modulator 118 coupled to an antenna 119 for transmitting data over communication channel 120. A bus system (not shown) may be provided which supports communication between the at least one processor 112, memory 113, carrier modulator 118 and antenna 119. [0056] The information source 111 is a source of data items to be transmitted over the communication channel 120 by the transmitter 110. The information source 111 is a source of data provided as a sequence of correlated data items x1, x2, x3, … xn in which the correlation is manifested as some degree of redundancy in the data in adjacent items in the sequence. The data items x1, x2, x3, … xn may, for example, be frames of a video in which the pixel data presented in consecutive video frames may be correlated in location and brightness. In the example information source 111 shown in Figure 2, which shows a video captured by a security camera of a largely static scene of a harbour under constant illumination, there may be a significant amount of redundancy from one video frame to the next.
Where the video captures a moving item in a scene, such as a moving boat in the scene, or a moving scene caught by a panning camera, differences between one frame and the next may be analysable by optical flow analysis. The information source 111 is not limited to being a source of video data, and the present disclosure is intended to be applicable to sources of any suitable sequences of correlated data items, where the correlation may occur in time or space, or both, or along any other suitable dimension over which the data items are correlated. For example, the information source 111 may be a source of sensor data from one or more sensors sensing one or more physical characteristics of a system that vary over time or location within the physical system in such a way that the data items are correlated. The data items may be captured in steps of equal or unequal intervals along the dimension in which they are correlated. [0057] The information source 111 is any information source suitable for arranging as a sequence of source symbols or fundamental data elements (for example, bits). For example, where the information source 111 is a video source that provides the sequence of data items x1, x2, x3, … xn as video frames, the correlated data items x1, x2, x3, … xn may each be represented by a 3D matrix with a depth based on the colour channels (normally 3 channels for RGB), a height, H, based on the height of the frame and a width, W, based on the width of the frame, i.e. xn ∈ ℝH×W×3. [0058] The information source 111 may generate the correlated data items locally to the transmitter (such as a video captured by a camera coupled to the transmitter, such as in the electronic device of which the transmitter is a part) or it may be a source of data stored locally to the transmitter that was generated elsewhere, remotely from the transmitter 110. The encoding and transmission of the data items from the information source 111 may be performed asynchronously with the time at which the data items were generated, or it may be performed live or in real time, with the encoding being performed largely contemporaneously to the generation of the data items. Either way, the information source 111 may provide the data items for encoding and transmission as a static media file that is encoded and transmitted and then reconstructed and stored at the receiver 130 where it can be viewed in a player or conveyed further. The information source 111 may also provide the data items for encoding and transmission as a stream of video frames for encoding and transmission on the fly, which are then to be reconstructed at the receiver 130 where it can be viewed in a player or conveyed further for replay as a streaming video, in which case the received video stream may or may not be stored locally at the receiver to allow subsequent asynchronous replay. [0059] The information source 111 may store or generate ‘raw’ or ‘uncompressed’ data directly or indirectly representative of characteristics of the information source, to allow faithful reproduction of the information source 111 by a given combination of data processing hardware appropriately configured, for example by software or firmware. Alternatively, the data items may be pre-processed before being passed to the encoder 115 through an initial form encoding which may already compress the data items. 
This does not preclude the encoder of the present disclosure learning a further optimal joint source channel coding for the communication channel 120 to minimise reconstruction errors. Further still, the data items may represent segments of the data provided by the information source 111. For example, rather than each data item representing an individual video frame, the video frames may be divided into blocks or segments, with each block being represented by a separate sequence of data items. [0060] The processor 112 executes instructions that can be loaded into memory 113. The processor 112 can include any suitable number(s) and type(s) of processors or other devices in any suitable arrangement. Example types of processor 112 include microprocessors, microcontrollers, digital signal processors, field programmable gate arrays and application specific integrated circuits. [0061] The memory 113 may be provided by any structure(s) capable of storing and facilitating retrieval of information (such as data, program code, and/or other suitable information on a temporary or permanent basis). The memory 113 can represent a random access memory or any other suitable volatile or non-volatile storage device(s). The memory 113 may also contain one or more components or devices supporting longer-term storage of data, such as a read-only memory, hard drive, flash memory, or optical disc, which may store software code for loading into the memory 113 at runtime. In use, the processor 112 and memory 113 provide a Runtime Environment (RTE) 114 in which instructions or code loaded into the memory 113 can be executed by the processor to generate instances of software modules in the Runtime Environment 114. [0062] The memory 113 comprises instructions which, when executed by the one or more processors 112, cause the one or more processors 112 to instantiate an encoder 115 in the RTE 114. The encoder 115 includes a key item encoder neural network 115k and an interpolation item encoder neural network 115i for encoding data items selected from the sequence of correlated data items to serve as key items and interpolation items respectively. The encoder 115 may include a motion and residual module 115m for deriving interpolation information in the input data space for encoding interpolation items, for example using optical flow analysis in the case of video data. The encoder 115 may also include a bandwidth allocation module 115b for determining the bandwidth to be allocated to the transmission of the interpolation items. The encoder 115 may also include an encoder control module 115c to control the operation of the key item encoder neural network 115k and the interpolation item encoder neural network 115i to encode data items from the correlated sequence for transmission. [0063] By implementing these component functional modules, the encoder 115 may be configurable by instructions stored in memory 113 and implemented in RTE 114 to carry out the runtime methods described in relation to Figure 3, Figures 6-9 and 11, and Figures 12-15 for encoding sequences of input data items x1, x2, x3, … xn from information source 111 to sequences of encoder output vectors z1, z2, z3, … zn being used for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the data items across a communication channel 120, the signal values provided from the encoder output vectors z1, z2, z3, … zn representing a transformed version of the input data items.
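By way of a concrete, non-limiting illustration of such an encoder mapping, a minimal convolutional key item encoder might be sketched as follows; the class name, layer counts, kernel sizes and activations are placeholders assumed for this sketch, not features of the disclosure.

```python
import torch
import torch.nn as nn

class KeyItemEncoder(nn.Module):
    """Illustrative convolutional key item encoder.

    Maps a video frame x_t (a 3-channel H x W image, held here in the
    channels-first layout PyTorch uses) to a much shorter latent vector
    z_t whose values can be used as channel input symbols.
    """
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2, padding=2),
            nn.PReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2),
            nn.PReLU(),
            nn.Conv2d(64, 16, kernel_size=5, stride=2, padding=2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 3, H, W) -> z: (batch, k) with k << 3*H*W,
        # giving the bandwidth compression described in the text.
        return self.net(x).flatten(start_dim=1)
```

The strided convolutions reduce the spatial resolution at each stage, so the flattened output vector is far shorter than the input frame, consistent with the coded and compressed representation described above.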
[0064] In more detail, in embodiments, the encoder 115 is configured to receive the data items x1, x2, x3, … xn, in the example the video frames, as input vectors for providing to input layers of the key item encoder neural network 115k and/or the interpolation item encoder neural network 115i. Once the encoder 115 has encoded data items from the sequence of correlated data items into encoder output vectors z1, z2, z3, … zn, the encoder output vectors z1, z2, z3, … zn are passed to the carrier modulator 118. The encoder output vectors z1, z2, z3, … zn are used for providing values in a signal space for modulating, by the carrier modulator 118, a carrier signal for transmission of a transformed version of the data items across a communication channel 120. The carrier modulator 118 may operate to in use directly encode the in-phase (I) and quadrature (Q) components of one or more carriers or subcarriers with signal values provided to the carrier modulator 118 by the encoder 115 using an appropriate modulation technique to provide a channel input signal 118i for transmission by antenna 119 across the communication channel 120. Where multiple carriers or subcarriers are encoded simultaneously, a suitable multiplexing technique such as orthogonal Frequency-Division Multiplexing (OFDM) may be used. As shown in Figure 2, the carriers encoding the encoder output vectors z1, z2, z3, … zn in the channel input signal 118i are then transmitted by the antenna 119 onto the communication channel 120. In other embodiments, the encoder 115 may be configured to output encoder output vectors z1, z2, z3, … zn that may provide values defining a probability distribution sampleable to provide values in a signal space that represent in-phase and quadrature components for modulation of the carrier signal for transmission over the communication channel. For example, the key item encoder neural network 115k and/or the interpolation item encoder neural network 115i may be configured as variational autoencoders. [0065] The carrier modulator 118 and antenna 119 may be of conventional construction and may be configured to encode the carriers/subcarriers with signal values of complex IQ representations. The carrier modulator 118 may be configured to freely modulate the carriers/subcarriers with any IQ signal value within the signal space passed to it. [0066] In other embodiments, the encoder output vectors provide values in the signal space that are assigned exactly to symbol values of a predetermined finite set S of symbols of a predefined alphabet of carrier modulation signal values transmittable by the transmitter over the communication channel. In embodiments, the predefined alphabet is a fixed, predefined constellation of symbols for digitally modulating the carrier signal to encode the input data for transmission over the communication channel. For example, the carrier modulator 118 may be configured to only be able to modulate the carriers/subcarriers with IQ values corresponding to one or more finite, fixed sets or ‘constellations’ of symbols such as by quadrature amplitude modulation (QAM) or binary phase-shift keying (BPSK). For example, the carrier modulator 118 and antenna 119 may be compatible with the 5G New Radio standard such that the transmittable symbols of IQ values are mapped to the 16-QAM, 64-QAM or 256-QAM constellations. 
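As a rough sketch of such a constrained modulator interface, the following maps arbitrary complex encoder outputs to the nearest points of a unit-average-power 16-QAM constellation. Training through the hard assignment would require a differentiable surrogate such as a straight-through estimator; that, along with the function names used here, is an implementation assumption beyond what the disclosure specifies.

```python
import torch

def make_16qam() -> torch.Tensor:
    """Unit-average-power 16-QAM constellation as complex points."""
    levels = torch.tensor([-3.0, -1.0, 1.0, 3.0])
    points = torch.cartesian_prod(levels, levels)
    const = torch.complex(points[:, 0], points[:, 1])
    return const / const.abs().pow(2).mean().sqrt()

def map_to_constellation(z: torch.Tensor, const: torch.Tensor) -> torch.Tensor:
    """Assign each complex channel symbol to its nearest constellation point.

    z holds the I and Q values produced by the encoder output layer,
    interpreted as complex numbers; the result contains only symbols
    the modulator is permitted to transmit.
    """
    # Pairwise distances: (num_symbols, constellation_size).
    d = (z.unsqueeze(-1) - const).abs()
    idx = d.argmin(dim=-1)
    return const[idx]
```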
The carrier modulator 118 and antenna 119 may be hard-wired to work only with these symbols, and they may not be able to transmit signal values or symbols that are not within these standard constellation sets. The encoder 115 may be configured to learn the optimum encoding within the available constellation of transmittable IQ signal values. [0067] The communication channel 120 may be used to convey information from one or more such transmitters 110 to one or more such receivers 130. The communication channel 120 may be a physical connection, e.g., a wire, or a wireless connection such as a radio channel as in the example shown in Figure 1. There is an upper limit to the performance of a communication system 100 which depends on the system specified. In addition, there is also a specific upper limit for all communication systems which no system can exceed. This fundamental upper limit is an upper bound to the maximum achievable rate of reliable communication over a noisy channel and is known as Shannon’s capacity. [0068] The communication channel 120, including the noise associated with such a channel, is modelled and defined by its characteristics and statistical properties. Channel characteristics can be identified by comparing the input and output of the channel, the output of which is likely to be a randomly distorted version of the input. The distortion indicates channel statistics such as additive noise, or other imperfections in the communication medium such as fading or synchronization errors between the transmitter 110 and the receiver 130. Channel characteristics include the distribution model of the channel noise, slow fading and fast fading. Common channel models include the binary symmetric channel and the additive white Gaussian noise (AWGN) channel. [0069] The receiver 130 includes at least one processor 132, memory 133 and a carrier demodulator 138 coupled to an antenna 139 for receiving data over communication channel 120. A bus system (not shown) may be provided which supports communication between the at least one processor 132, memory 133, carrier demodulator 138 and antenna 139. The receiver 130 thus includes an information sink 131 to which the reconstructed representation of the input data decoded by the decoder neural network 135 is provided. [0070] Similarly to the processor 112 and memory 113 of the transmitter 110, in the receiver 130, the processor 132 executes instructions that can be loaded into memory 133, and in use provides a Runtime Environment (RTE) 134 in which instructions or code loaded into the memory 133 can be executed by the processor to generate instances of software modules in the Runtime Environment 134. The memory 133 comprises instructions which, when executed by the one or more processors 132, cause the one or more processors 132 to instantiate a decoder 135. [0071] The antenna 139 of the receiver 130 receives as a channel output 138o from the communications channel 120 noise-affected versions ẑ1, ẑ2, ẑ3, … ẑn
of the encoder output vectors z1, z2, z3, … zn transmitted by the antenna 119 of the transmitter 110, the noise having been added by the communication channel 120. The carrier demodulator 138 demodulates the received channel output to recover these noisy versions ẑ1, ẑ2, ẑ3, … ẑn
of the encoder output vectors z1, z2, z3, … zn, for example, by coherent demodulation, and passes them to the decoder 135 in the RTE 134. These noisy demodulated versions
ẑ1, ẑ2, ẑ3, … ẑn
of the encoder output vectors z1, z2, z3, … zn are then mapped by the decoder 135 to a reconstructed representation
x̂1, x̂2, x̂3, … x̂n
of the originally input sequence of data items x1, x2, x3, … xn where they are passed to the information sink 131 at which a reconstruction of the information source 111 is collected for viewing, storing or conveying further. As can be seen in Figure 2, the information sink 131 collects a decoded reconstruction x̂1, x̂2, x̂3, … x̂n of the video frames x1, x2, x3, … xn passed to the encoder 115 and transmitted over the communications channel 120. [0072] In detail, the decoder 135 includes a key item decoder neural network 135k and an interpolation item decoder neural network 135i for decoding data items indicated as key items and interpolation items respectively. The decoder 135 may include a motion and residual module 135m for reconstructing interpolation data items in the input data space using decoded interpolation information provided by the interpolation item decoder neural network 135i. The decoder 135 may also include a decoder control module 135c to control the operation of the key item decoder neural network 135k and the interpolation item decoder neural network 135i to decode data items from the correlated sequence for provision to the information sink 131 at which the reconstructed representation of the input data items from the information source 111 is collected. [0073] By implementing these component functional modules, the decoder 135 may be configurable by instructions stored in memory 133 and implemented in RTE 134 to carry out the runtime methods described in relation to Figure 4, Figures 6-8, 10 and 11, and Figures 12-14 and 16 for decoding sequences of noise-affected versions ẑ1, ẑ2, ẑ3, … ẑn
of the encoder output vectors z1, z2, z3, … zn received over the communications channel 120 to a reconstructed representation
x̂1, x̂2, x̂3, … x̂n
of the originally input sequence of data items x1, x2, x3, … xn. [0074] The encoder control module 115c may be configured for passing data items from the information source 111 to the key item encoder neural network 115k or the interpolation item encoder neural network 115i for encoding. The encoder control module 115c may be configurable to operate as a static control module, which may pass data items from the information source 111 to the key item encoder neural network 115k or the interpolation item encoder neural network 115i based on a fixed order specified by a predetermined group of items, such as a fixed group of pictures for a video encoding scheme (e.g. every 7th item may be a key item, with the intervening items all being interpolation items). The encoder control module 115c may also be configurable to operate instead as a dynamic control module, which may pass data items from the information source 111 to the key item encoder neural network 115k or the interpolation item encoder neural network 115i based on a dynamically assigned order specified by, for example, a decision agent implemented as a Markov Decision Process. [0075] The decoder control module 135c may be configured for passing the noise-affected versions ẑ1, ẑ2, ẑ3, … ẑn
of the encoder output vectors z1, z2, z3, … zn received over the communications channel 120 to either the key item decoder neural network 135k or the interpolation item decoder neural network 135i for decoding based on whether the respective data item is indicated as a key item or an interpolation item. Where a fixed group of items structure is used, the passing of the noise-affected versions ẑ1, ẑ2, ẑ3, … ẑn of the encoder output
vectors z1, z2, z3, … zn received over the communications channel 120 may be based on a fixed order specified by a predetermined group of items. Where a dynamic group of items is used, the receiver 130 may receive separate signalling from the transmitter 110 indicating the sequence of key items and interpolation items based on the dynamic operation of the encoder control module 115c. [0076] The key item encoder neural network 115k and key item decoder neural network 135k are formed as a complementary pair which may be configured as an autoencoder for encoding and decoding data items in the sequence selected as key items independent of any other data item in the sequence. That is, the key item encoder neural network 115k is for encoding data items selected from the sequence to serve as key items that can be directly reconstructed by key item decoder neural network 135k, the encoding and decoding of key items being independent of any other data item in the sequence. [0077] Similarly, the interpolation item encoder neural network 115i and the interpolation item decoder neural network 135i are formed as a complementary pair which may be configured as an autoencoder, a recurrent neural network, a long short-term memory, or any other suitable neural network configuration, for encoding and decoding data items in the sequence selected as interpolation items by interpolation at least in relation to a previous data item in the sequence. That is, the interpolation item encoder neural network 115i is for encoding data items selected from the sequence to serve as interpolation items that can be reconstructed by the interpolation item decoder neural network 135i, and other components of the decoder 135 as needed, using interpolation, the encoding and decoding of interpolation items using data representing the input data item and at least one previous data item in the sequence. [0078] Neural networks are machine learning models that employ multiple layers of nonlinear units (known as artificial “neurons”) to generate an output from an input. Neural networks may be composed of several layers, each layer formed from nodes. Neural networks can have one or more hidden layers in addition to the input layer and the output layer. The output of each layer is used as the input to the next layer (the next hidden layer or the output layer) in the network. Each layer generates an output from its input using a set of parameters, which are optimized during the training stage. For example, each layer comprises a set of nodes, the nodes having learnable biases and their inputs having learnable weights. Learning algorithms can automatically tune the weights and biases of nodes of a neural network to optimise the output in order to minimise an objective function using an optimisation algorithm such as gradient descent or stochastic gradient descent. [0079] The key item encoder neural network 115k has an input layer having nodes for receiving input data x1, x2, x3, … xn for encoding representative of input data items from the information source. [0080] The interpolation item encoder neural network 115i also has an input layer having nodes for receiving input data for encoding. The data input to the input layer of the interpolation item encoder neural network 115i depends on the neural network architecture, and whether the interpolation item encoder neural network 115i encodes interpolation information in the input data space or the latent space. 
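The two cases are elaborated in paragraph [0081] immediately below; purely for illustration, the corresponding encoder inputs might be assembled as in the following sketch, in which the function and argument names and the concatenation layout are assumptions.

```python
import torch

def interp_encoder_input_pixel_space(x_t, x_prev, motion, residual):
    """Input-data-space variant: the current frame is presented together
    with the previous frame and the motion/residual information produced
    by the motion and residual module (e.g. optical flow for video)."""
    # All tensors assumed (batch, channels, H, W); concatenate channel-wise.
    return torch.cat([x_t, x_prev, motion, residual], dim=1)

def interp_encoder_input_latent_space(z_t, z_prev):
    """Latent-space variant: the key item encoder first maps the current
    and previous frames to latent vectors z_t and z_{t-1}, and the
    interpolation encoder works on those instead of raw pixels."""
    return torch.cat([z_t, z_prev], dim=-1)
```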
[0081] That is, where the interpolation happens in the input data space, the input layer of the interpolation item encoder neural network 115i may receive input data xn for encoding representative of input data items from the information source (relating to the current and at least the previous input data item, i.e. xn and xn-1), as well as motion and residual information generated by the motion and residual module 115m. In other arrangements, where the interpolation happens in the latent space, the input layer of the interpolation item encoder neural network 115i may receive the output vector zn of the key item encoder neural network 115k for current and at least the previous input data item in the sequence (i.e. zn and zn-1). [0082] The key item encoder neural network 115k and interpolation item encoder neural network 115i have respective encoder output layers that output encoder output vectors z1, z2, z3, … zn that are used for providing values in a signal space for modulating, by the carrier modulator 118, a carrier signal for transmission by antenna 119 over communications channel 120. [0083] The key item encoder neural network 115k and interpolation item encoder neural network 115i have hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in the encoder input layer thereof to the encoder output vectors such that the transmitter 110 transmits a transformed version z1, z2, z3, … zn of the input data items x1, x2, x3, … xn across the communication channel 120. [0084] Similarly, the key item decoder neural network 135k and interpolation item decoder neural network 135i have hidden layers of connecting nodes with weights that, in use, map vectors of values
ẑ1, ẑ2, ẑ3, … ẑn
received at nodes in a decoder input layer thereof to decoder output vectors
x̂1, x̂2, x̂3, … x̂n provided at nodes of a decoder output layer thereof, the decoder output vectors
x̂1, x̂2, x̂3, … x̂n
providing a reconstruction of the encoder input vector to generate a representation of the input data item. [0085] In accordance with the present disclosure, the connecting node weights of the key item encoder neural network 115k and interpolation item encoder neural network 115i have been trained together with the respective complementary key item decoder neural network 135k and interpolation item decoder neural network 135i, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items. The training of the connecting node weights may be performed using an appropriate optimization algorithm operating on the objective function. [0086] In this way, the input data from the information source 111 (such as the image or video) encoded and transmitted by the transmitter 110 can be received and decoded at the receiver 130 to allow a reconstructed representation of the original input image or video to be generated at information sink 131. [0087] The key item encoder neural network 115k and interpolation item encoder neural network 115i may be configured to create a coded and compressed representation of the input data as an encoder output vector for transmission across the communications channel. For example, the encoder output vectors may contain less information than the encoder input vectors for each data item. Further, the encoder 115 may be configured such that the bandwidth allocation module 115b may interoperate with the key item encoder neural network 115k, interpolation item encoder neural network 115i and encoder control module 115c to encode the data items selected as interpolation items in such a way that an available bandwidth in the communication channel 120, or data budget, is shared between the data items so that the data items are compressed, for example such that a channel use constraint is met. In embodiments, the bandwidth allocation module 115b may be provided by a neural network configured, for example by reinforcement learning, to select, for each interpolation item in a group of pictures, a number of blocks of the interpolation item encoder output vector for transmission, where the interpolation item encoder neural network 115i has been trained to encode increasing information with an increased number of encoder output vector blocks. In other embodiments, where the interpolation item encoder neural network 115i is provided as a recurrent neural network (RNN), such as a long short-term memory (LSTM), that encodes successive interpolation items into an internal cell state thereof for transmission as an encoder output vector, the bandwidth allocation module 115b may be configured to work with the encoder control module 115c to dynamically select successive data items as interpolation items to be encoded together into the same recurrently updated encoder output vector for transmission, such that successive interpolation items are encoded together in a compressed transmission for recurrent decoding. Similarly then, the decoder may be configured to decode the noise-affected version ẑ
of the encoder output vector back into an uncompressed reconstruction
x̂
of the input data. In embodiments, the quantity of information included in the output vector of the key item encoder neural network and/or the interpolation item encoder neural network is smaller than the input vector. [0088] Although the examples described above disclose the output of the encoder 115 being a vector of signal values in IQ space transmittable by the transmitter over the communication channel 120, in other embodiments the communication channel 120 may be one that also includes an existing channel encoder and decoder scheme, in which the signal space of the channel input may be the predetermined finite set of symbols of the channel code (which could be bit values) for modulating a signal carrier for providing input signals in the alphabet in the input signal space for the communication channel 120. Thus, besides random noise applied by the communication channel, the transformation applied by the communication channel 120 may, in embodiments, also include an existing channel code. Thus in these embodiments, the signal space may be a message alphabet for an existing channel code by which the input signals to the given communications channel are modulated. In this case, the encoder output vector will be mapped into the message alphabet of the corresponding channel code (rather than, for example, the raw IQ values transmittable by the transmitter). The noise-affected version of the encoder input vector input to the decoder neural network 135 may correspond to the hard-decoded message of the existing channel decoder. In this respect, in these embodiments, the encoder neural network 115 and decoder neural network 135 may learn an optimum mapping of the input information source 111 to inputs of an existing channel code of the communications channel 120 that reduces reconstruction errors at the output 131 of the decoder neural network 135. Although acting as an outer code in these embodiments, this learned coding of the encoder neural network 115 and decoder neural network 135 is still optimised based on the characteristics of the communication channel 120 to reduce reconstruction errors, even though in these alternative embodiments the communication channel 120 includes an existing channel code. This is unlike existing modular source codes which are defined independently of the random transformation applied by any channel. [0089] Reference will now be made to Figures 3 and 4 which set out in more detail how the transmitter 110 and receiver 130, and the trained neural networks of the encoder 115 and decoder 135, operate to transmit data from information source 111 across communication channel 120 by joint source and channel coding. [0090] As shown in Figure 3, the run time method 300 for the transmitter 110 and the encoder 115 starts in step 301 in which a sequence of correlated data items for intervals t = 1, 2, 3 … n, such as frames of a video, is received from an information source 111 for passing to the key item encoder neural network 115k or interpolation item encoder neural network 115i as encoder input vectors x1, x2, x3, … xn for encoding. In examples, the data items may be received at the encoder control module 115c which may select each data item in the sequence as a key item or an interpolation item, according to a static or dynamic allocation of a group of items, and then pass them to the key item encoder neural network 115k or interpolation item encoder neural network 115i as encoder input vectors x1, x2, x3, … xn for processing accordingly.
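Paragraphs [0091] and [0092] below describe static and dynamic control in more detail; as a rough illustration of the static case only, the dispatch of steps 301 to 304 might be sketched as follows, where the group-of-pictures length, the function names and the encoder interfaces are assumptions (the interpolation encoder is shown taking only the previous frame as side information, which is a simplification).

```python
GOP_SIZE = 7  # assumed group-of-pictures length: 1 key item + 6 interpolation items

def select_item_type(t: int) -> str:
    """Static encoder control: every GOP_SIZE-th item is a key item,
    the items in between are interpolation items."""
    return "key" if t % GOP_SIZE == 0 else "interpolation"

def encode_sequence(frames, key_encoder, interp_encoder):
    """Dispatch each frame to the appropriate encoder, mirroring the
    selection between steps 303 and 304 of the run time method of
    Figure 3."""
    outputs, prev = [], None
    for t, x_t in enumerate(frames):
        if select_item_type(t) == "key":
            outputs.append(key_encoder(x_t))
        else:
            outputs.append(interp_encoder(x_t, prev))
        prev = x_t
    return outputs
```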
[0091] In embodiments, the encoder control module 115c may be a static control module configured to select data items from the sequence of data items as key items and interpolation items according to a fixed order specified by a predetermined group of items. [0092] Alternatively, or in addition, the encoder control module 115c may be a dynamic control module having a dynamic decision agent configured to dynamically choose whether the input data item xt is to serve as a key item or an interpolation item. The dynamic decision agent may be configured to dynamically choose whether the input data item xt is to serve as a key item or an interpolation item based at least on one or more of: the current data item xt; the number of data items transmitted since last key item; a current average channel utilisation; and a channel utilisation constraint. The dynamic decision agent may be configured to dynamically choose whether the input data item xt is to serve as a key item or an interpolation item so that the average channel utilisation is below the channel utilisation constraint. To facilitate decoding by the decoder 135 where the ordering of key and interpolation items is not fixed, and is instead dynamically assigned, the dynamic decision agent may be configured to generate data mapping, for the sequence of data items, which data items are key data items and which data items are interpolation data items. This mapping is for transmission across the communications channel 120 and for use by the decoder 135 to determine whether the received noise-affected version of an encoder output vector zt should be decoded by the key item decoder neural network 135k or the interpolation item decoder neural network 135i. [0093] Thus, in step 302, the encoder 115, or in embodiments more specifically the encoder control module 115c, may determine, statically or dynamically, whether the data item xt is selected as a key item or an interpolation item. [0094] If the data item is selected as a key item, in step 303, the data item xt is passed to the key item encoder neural network 115k where it is encoded to a latent encoder output vector zt based on the input data item xt and being independent of any other data item in the sequence. As a key item, the data item xt can be directly reconstructed by the decoder from the noise- affected version
ẑt
of the encoder output vector zt alone. [0095] If the data item is selected as an interpolation item, in step 304, the interpolation item encoder neural network 115i is used to encode the data item to a latent encoder output vector zt representative of the input data item xt. The encoding by the interpolation item encoder neural network 115i may be performed using data representing the input data item xt and at least one previous data item in the sequence xt-1. The encoding by the interpolation item encoder neural network 115i may also be performed using data representing the input data item xt and at least one subsequent data item in the sequence xt+1. In this respect, the input data item xt, and other data items used in the encoding of the latent encoder output vector zt by the interpolation item encoder neural network 115i, may be pre-processed before being passed to the interpolation item encoder neural network 115i to provide representative data to facilitate the encoding of interpolation information to allow the reconstruction of the input data item xt at the decoder by interpolation from reconstructions of representations of other data items in the sequence. [0096] For example, before being passed to the interpolation item encoder neural network 115i, if the interpolation information for the data item xt is being evaluated in the input data space, the input data item xt may be pre-processed by a motion and residual module 115m of the encoder 115 to generate one or more of: motion representation information representing the relative motion between the data item and at least one other data item in the sequence; and the residual information between the data item and a motion compensated version of at least one other data item in the sequence using the motion representation information in respect of that data item. In this respect, the encoding of a latent encoder output vector zt for interpolation items by the interpolation item encoder neural network 115i may use data received at the input layer thereof representing the input data item in input data space and at least one previous data item in the sequence in input data space. For video frames, the input data space is the pixel space of the video. The motion representation information may therefore be optical flow information for the input video frame xt produced from an optical flow analysis of the sequence of video frames. [0097] In other examples, before being passed to the interpolation item encoder neural network 115i, if the interpolation information for the data item xt is being evaluated in the latent space, the input data item xt may be pre-processed by the key item encoder neural network 115k to encode the data item xt into a latent space vector zt. In this respect, the input layer of the interpolation item encoder neural network 115i may be configured such that the interpolation item encoding uses data representing: the input data item in the latent space defined by the output of the key item encoder neural network 115k (i.e. the vector zt output therefrom), and at least one previous data item in the sequence in the latent space defined by the output of the key item encoder neural network 115k (i.e. the vector zt-1 output therefrom for the previous data item xt-1). The output of the interpolation item encoder neural network 115i may also be a vector in the latent space z, the vector zt being representative of interpolation information in the latent space z for the data item xt.
Encoding the interpolation information in the latent space z in this way may be more efficient and effective than encoding the interpolation information in the input data space of x. [0098] Once the data item xt is encoded by the key item encoder neural network 115k or the interpolation item encoder neural network 115i into a latent space vector zt, it is passed in step 305 to the carrier modulator 118. The latent space vector zt has values usable for providing values in a signal space for modulating, by the carrier modulator 118, a carrier signal or one or more subcarriers with a transformed version of the data item xt. Once the carrier signal has been encoded it is transmitted across communication channel 120 using antenna 119. [0099] For the latent space vectors z1,2,3,…n provided by the interpolation item encoder neural network 115i, in embodiments where the encoder output layer of the interpolation item encoder neural network 115i is divided into blocks and is trained to encode descending ordering of information in increasing blocks of output nodes, and with increasing blocks in the latent space vector, the bandwidth allocation module 115b may control the number of output blocks of each latent space vector zt that are transmitted to share out the available bandwidth or bit budget, for example to meet an average channel use condition, or based on an allocation of bandwidth for the transmission. In embodiments, the bandwidth allocation module 115b may be configured to determine the number of blocks of the interpolation encoder output layer to be transmitted to minimise the reconstruction error between the representation of the input data reconstructed at the decoder and the input data encoded at the encoder. In embodiments, the bandwidth allocation module 115b may be configured to determine the number of blocks of the interpolation encoder output layer to be transmitted based on at least motion representation information determined to represent the relative motion between the data item xt and at least one other data item in the sequence (e.g. xt-1). In embodiments, the bandwidth allocation module is configured to determine a number of blocks of the interpolation encoder output layer to be transmitted, so as to seek to optimally allocate the available bandwidth in the communications channel between a group, a set or the whole sequence of data items to be transmitted (e.g. x1, x2, x3, … xn). [00100] On the other hand, in embodiments where the interpolation item encoder neural network 115i is provided by a recurrent neural network (RNN) such as an LSTM, the bandwidth allocation module 115b may work together with the encoder control module 115c to encode successive interpolation items into the cell of the RNN for transmission in a single latent space vector zt for recursive decoding by the decoder 135. That is, the encoder neural network 115i is operated by the bandwidth allocation module 115b working together with the encoder control module 115c to maintain and update an internal cell state thereof as successive interpolation items of a group of consecutive interpolation items are encoded by the interpolation encoder neural network 115i.
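One possible realisation of such a recurrent interpolation encoder is sketched below using an LSTM cell; emitting the concatenated hidden and cell state as the group's single output vector is an assumption of this sketch, as is the PyTorch interface and the assumption that each item has already been reduced to a feature vector.

```python
import torch
import torch.nn as nn

class RecurrentInterpolationEncoder(nn.Module):
    """Sketch of an LSTM-based interpolation item encoder.

    Successive interpolation items of a group are folded into the
    internal (hidden and cell) state; after the last item of the group,
    the internal state itself is emitted as the single encoder output
    vector z_t for the whole group.
    """
    def __init__(self, feature_dim: int, state_dim: int):
        super().__init__()
        self.cell = nn.LSTMCell(feature_dim, state_dim)

    def forward(self, group):
        # group: non-empty list of (batch, feature_dim) tensors,
        # one feature vector per interpolation item in the group.
        h = c = None
        for feat in group:
            h, c = self.cell(feat, None if h is None else (h, c))
        # The recurrently updated internal state becomes the output
        # vector for the whole group: (batch, 2 * state_dim).
        return torch.cat([h, c], dim=-1)
```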
After successive interpolation items have been encoded into the internal state, the encoder 115 is configured to provide the internal state as the interpolation encoder output vector zt for providing values in a signal space for modulating the carrier signal for transmission of a transformed version of the group of consecutive interpolation data items across the communication channel 120. That is, the encoder 115 is configured to output an encoder output vector zt for transmission for each key item and each group of consecutive interpolation items between key items. In this way, an available bandwidth or bit budget can be shared between the input data items, for example based on a dynamic decision in a Markov Decision Process of whether or not the current data item should be a key item or an interpolation item. [00101] The transmission by the transmitter 110 of the carrier signals modulated using the latent space vectors z1,2,3,…n may be in sequence as each latent space vector is encoded for each time step as the data item xt for that time step is generated or received, for example on the fly in the event of streaming data. On the other hand, where the data items x1,2,3,…n are previously generated and stored in a static media file, the latent space vectors z1,2,3,…n may all be encoded first before being transmitted in sequence by the transmitter. [00102] After the signal values from the encoded latent space vectors z1,2,3,…n have all been transmitted, the transmitter process 300 is completed. [00103] Turning now to Figure 4, the run time method 400 for the receiver 130 and the decoder 135 starts in step 401 in which the antenna 139 receives a carrier signal from communications channel 120 and passes it to carrier demodulator 138 which demodulates the carrier signal to recover the noise-affected versions ẑ1,2,3,…n of the encoder output vectors z1,2,3,…n as they are received, and passes them to the decoder control module 135c. [00104] In step 402, the decoder control module 135c determines whether the noise-affected version of encoder output vector zt for a given time step t encodes the input data item xt for that time step as a key item or an interpolation item. In embodiments where the ordering of the group of items is static and follows a pre-defined order, the decoder control module 135c may determine whether the received
Figure imgf000034_0002
̂ is representative of a key item or an interpolation item based on that order. In embodiments where the ordering of the data items as key or interpolation items is assigned dynamically, the decoder control module 135c may determine whether the received
Figure imgf000034_0003
is representative of a key item or an interpolation item based on mapping data received from the encoder 110. [00105] If, in step 402, the noise-affected version
Figure imgf000034_0004
of encoder output vector zt is indicated as a key item, in step 403, the vector
Figure imgf000034_0007
is passed to the key item decoder neural network 135k where it is directly decoded to provide a reconstruction
Figure imgf000034_0005
of the input vector xt independently of any other data item in the sequence. In this way, the key item decoder neural network 135k generates a representation of the input data items of x1,2,3,…n indicated as key items directly from the relevant noise-affected version of the encoder input vectors received at
Figure imgf000034_0006
the receiver. [00106] If, in step 402, the noise-affected version
Figure imgf000034_0010
of encoder output vector zt is indicated as an interpolation item, in step 404, the vector
Figure imgf000034_0012
is passed to the interpolation item decoder neural network 135i where it is decoded to provide a reconstruction
Figure imgf000034_0011
of the input vector xt based on data representing at least one previous data item xt-1 in the sequence and the noise- affected version of the encoder output vector zt for the data item.
Figure imgf000034_0013
[00107] To provide a reconstruction
Figure imgf000034_0008
of the input vector xt , in embodiments where the interpolation is performed in the input data space, the data representing at least one previous data item xt-1 in the sequence used by the interpolation item decoder neural network 135i to decode interpolation items may comprise a reconstruction
Figure imgf000034_0009
of the encoder input vector providing a representation of the input data item xt-1 for at least the previous data item in the sequence. In embodiments, the interpolation item decoder neural network 135i decodes the noise-affected version of the encoder output vector zt to provide an estimate of the motion
Figure imgf000034_0014
representation information representing the relative motion between the data item xt and the at least one other data item in the sequence (e.g. xt-1) generated at the motion and residual module 115m and encoded by the interpolation item encoder neural network 115i. In embodiments, the interpolation item decoder neural network 135i decodes the noise-affected version of the encoder output vector zt to provide an estimate of the residual information between the data item xt and a motion compensated version of the at least one other data item in the sequence (e.g. xt-1) using the motion representation information in respect of that data item generated at the motion and residual module 115m and encoded by the interpolation item encoder neural network 115i. [00108] To provide a reconstruction of the input vector xt , in embodiments where the
Figure imgf000035_0011
interpolation is performed in the latent space, the data representing at least one previous data item xt-1 in the sequence used by the interpolation item decoder neural network 135i to decode interpolation items may comprise a noise-affected version of an encoder output vector In embodiments, the interpolation item decoder neural network 135i provides a
Figure imgf000035_0001
reconstruction
Figure imgf000035_0002
of the input vector xt by decodes the noise-affected version
Figure imgf000035_0008
of the encoder output vector zt based on the vector and the vector for the previous data item. That is,
Figure imgf000035_0012
Figure imgf000035_0004
the vector and
Figure imgf000035_0003
are used recursively as inputs by the recurrent neural network to update the cell state and provide the reconstruction as an output. Specifically, the vector
Figure imgf000035_0005
Figure imgf000035_0007
corresponds to a representation of the previous data item xt-1 in the latent space as represented by an encoding of the reconstruction
Figure imgf000035_0006
using the key item encoder neural network 115k. For this purpose, the decoder 135 may store a software module in memory 133 for instantiating the key item encoder neural network 115k locally in RTE 134. [00109] In embodiments, the reconstruction
Figure imgf000035_0009
of the encoder input vector providing a representation of the input data item xt-1 is obtained at the [00110] Once the reconstruction
Figure imgf000035_0010
of the encoder input vector xt is decoded by the key item decoder neural network 135k or the interpolation item decoder neural network 135i it is passed in step 405 to the information sink 131 at which the information source is reconstructed. The reconstruction of the information source generated in the information sink 131 may be stored locally, for example in memory 133, for local reproduction at a later stage, or it may be reproduced contemporaneously without being stored permanently locally (for example in the case of streaming media). The reconstruction of the information source generated in the information sink 131 may also be conveyed onward for reproduction elsewhere, for example using the Internet. [00111] Regarding training the neural networks, a training time process 500 for optimising the weights of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, to minimise reconstruction errors will now be described with reference to Figure 5. [00112] Once all the neural networks of the communication system 100 have been designed and initialised with suitable initial encoder and decoder weights and parameters, the weights of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, are jointly optimized end-to-end in an unsupervised manner by passing training data sample vectors
Figure imgf000036_0001
as inputs through the communication system 100 (or a simulation thereof using a channel model to add noise) and receiving its reconstruction vector
Figure imgf000036_0002
in a forward pass of training data through the neural networks. That is, in step 501 the vectors
Figure imgf000036_0003
is received (individually or in batches) and are passed through the communication system to obtain the reconstruction vectors
Figure imgf000036_0004
to form input- output pairs of a set of training data in respect of a training data information source 111. [00113] In examples, in the training phase, the input-output pairs of vectors
Figure imgf000036_0005
of training data may be calculated empirically, by the transmitter 110, in the forward pass, encoding and transmitting the encoder output vector
Figure imgf000036_0006
representation of the input vector
Figure imgf000036_0007
of training data across the communication channel 120 where the signal values are subsequently received as the noise-affected vector
Figure imgf000036_0009
decoded by receiver 130 to the reconstruction vector
Figure imgf000036_0008
In this way, the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, can be optimised to take into account the noise in the channel through training based on empirical data capturing the effects of channel noise on the transmission. [00114] In other examples, as shown in the Figure 5 example, in the training phase, the input- output pairs of vectors
Figure imgf000036_0010
of training data may be generated using a model of the communication channel 120 to estimate channel noise and add it to the transmitted encoder output vector to generate simulation of the noise-affected vector
Figure imgf000036_0011
subsequent decoding and reconstructing of the output training data vector by the decoder neural networks 135k and
Figure imgf000036_0012
135i. In these examples, a channel model can be adopted that simulates the practical channel experienced in the operational regime. For simplicity, a complex additive white Gaussian noise (AWGN) channel model can be adopted, which produces the channel output where
Figure imgf000036_0013
Figure imgf000036_0014
is a vector containing elements drawn from zero-mean Gaussian distribution of variance σ2. However, in general, the channel model can be any model that simulates an arbitrary transformation of the encoder output vector
Figure imgf000036_0017
transmitted by the transmitter 110. [00115] The training process may perform batchwise optimisation across groups of input- output pairs, such as using gradient descent to find a gradient error in the forward pass and determine an update to the weights. In other examples, stochastic gradient descent may be used in which the error is determined and weights updated for each input-output pair of vectors
Figure imgf000036_0016
of training data, before the next of pair of vectors of the training data are determined
Figure imgf000036_0015
using the updated weights. [00116] When a batch or a single input-output pair of vectors
Figure imgf000037_0002
of training data have been received to optimise and update the weights in the training process, in step 503, an objective function is determined characterising a reconstruction error between the input-output pairs of vectors of training data. In the example, as shown in Figure 5, the reconstruction error for
Figure imgf000037_0003
the objective function is characterised using the Mean Squared Error loss between
Figure imgf000037_0004
calculated by:
Figure imgf000037_0001
[00117] Other objective functions characterising the reconstruction error may be used. [00118] Once the objective function has been calculated for the input-output pair of vectors
Figure imgf000037_0005
of training data, the method further comprises, in steps 505 and 507 which may be performed together, using an appropriate optimisation algorithm operating on the objective function, updating the connecting node weights of the hidden layers of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, to seek to minimise the objective function. [00119] In step 505, the gradient descent optimisation algorithm is used to seek to minimize the objective function by using a differential of the objective function to determine the gradient and the direction towards a minimum value for the objective function. Thus, in a backward pass through the communication system, the gradient descent algorithm operates on the objective function based on a differential of at least the key item encoder and decoder neural network pair, 115k and 135k, for training data items that are key items, and the interpolation item encoder and decoder neural network pair, 115i and 135i, for training data items that are interpolation items. For example, using backpropagation, the gradient of the objective function can be efficiently calculated with respect to the weights in the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, for example by unstacking the elementary functions used to compute the forward pass, and by repeatedly applying the chain rule to autodifferentiate them and determine the gradient with respect to the weights in the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, by backpropagation. [00120] Once the gradient of the object function is determined with respect to the weights in the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, for the training data, in step 507, the connecting node weights of the hidden layers of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, are updated to seek to minimise the objective function. In examples, this is achieved in the gradient descent optimisation method by the using the determined gradient to estimate an update to the weights of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, that is expected to step the objective function towards a minimum, where the local gradient is zero. [00121] Once the estimate of the update to the weights is determined by an optimisation method, the weights of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, are updated and, in step 509, it is checked whether there are more samples of training data in the training set. If there are, the process 500 returns to step 501 and the next batch or training sample is received and the optimisation method is carried out again to further optimise the weights of the neural networks. 
If training over the training set is complete, the process 500 ends and a trained key item encoder and decoder neural network pair, 115k and 135k, and interpolation item encoder and decoder neural network pair, 115i and 135i, are provided for use in an operational communication system 100 for transmitting input data over a communication channel 120. [00122] For the optimization process, as the encoder and decoder blocks are built as artificial neural networks with learnable parameters so that the transformation from/to data to latent representation (code) can be learned directly from data. If the constellation symbols ^ transmittable by the transmitter are predefined, as it is the case when using standard communication hardware and protocols, the or if an existing channel code is used, these pre- existing codes act as constraints for the optimization and the objective function. [00123] If a channel model is used in the forward pass of the training process, rather than empirically generating training data, the channel model can be included directly in the backward pass in the optimisation algorithm. If the channel model used is differentiable, it can be used directly in the backpropagation stage. If it is not differentiable, a generative adversarial network (GAN) may be used to learn a differentiable representation of the channel model. In this way, the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, can be optimised to take into account the noise in the channel through training based on a theoretical noise model of the communication channel. Thus the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, can be trained together using training data in which a model of the communication channel is used to estimate channel noise and add it to the transmitted signal values
Figure imgf000039_0003
to generate the noise- affected version
Figure imgf000039_0004
of the vector of signal values in the input-output pairs of training data. [00124] It should be noted that, in other examples, the objective function may characterise and optimise against further constraints and characteristics of the communication system 100, such as to obtain an average power in the symbols transmitted across the communication system 100, so as to ensure the learned coding system satisfies an average power constraint. [00125] For example, where the interpolation item encoder and decoder neural network pair, 115i and 135i, are to encode descending ordering of information in increasing blocks of nodes, such that, for interpolation items, the decoder reconstructs an increasingly refined representation of the input data with increasing blocks received in the noise-affected version of an encoder output vector, the training process 500 can be adapted. Here, the encoder output layers of the interpolation item encoder neural network 115i and the decoder input layers of the interpolation item decoder neural network 135i are divided into ordered blocks. At training time the encoder output vector
Figure imgf000039_0001
passed to the decoder input layer for each input-output pair during training is selected to have a random number of the ordered blocks, such that, following training, the interpolation item encoder neural network 115i encodes descending ordering of information in increasing blocks of nodes, and such that, for interpolation items, the decoder 135 reconstructs an increasingly refined representation of the input data with increasing blocks received in the noise-affected version of the encoder output vector.
Figure imgf000039_0002
[00126] Referring now to Figures 6, 7, 8, 9, 10 and 11, a detailed example embodiment of a communication system 100 for transmitting video across a communications channel will now be described. The communication system 100 in this embodiment operates a static control module allocating data items according to a fixed group of pictures. The communication system 100 in this embodiment determines interpolation information for encoding interpolation data items in a pixel space of the input data items. Further still, the communication system 100 in this embodiment includes an interpolation encoder neural network 115b having an output layer of nodes divided into blocks such that it encodes descending ordering of information in increasing blocks of nodes, and a bandwidth allocation module 115b for selecting a number of blocks of an encoder output vector for interpolation items to share or allocate bandwidth between interpolation items in the group of pictures. As will be seen, this arrangement has been shown to outperform existing separate source coding and channel coding schemes in terms of reduced reconstruction errors at the decoder across a wide range of different channel conditions. [00127] It should be noted that, in relation to Figures 6, 7, 8, 9, 10 and 11, features with like reference numbers represent the same features as those described in relation to Figures 1 and 2, and should be understood accordingly, and so a detailed description of those features may be omitted in the following. [00128] Consider a group of N frames
Figure imgf000040_0001
from a video sequence. This is called a group of pictures (GoP) and Xn refers to the nth GoP in a video sequence received in step 901 from an information source. A static control module 115c in step 902 determines that thefirst
Figure imgf000040_0002
and last
Figure imgf000040_0003
frames are the key frames in the fixed GoP structure, and these are compressed and transmittedfirst using key item encoder neural network 115k. [00129] The key item encoder neural network 115k, parameterised by θ, and mapping a frame to a complex latent vector
Figure imgf000040_0004
representing the In-phase (I) and Quadrature (Q) components of a complex channel symbol, is then defined as:
Figure imgf000040_0005
[00130] This is achieved by pairing consecutive real values at the output of the neural network. [00131] The values in the complex latent vector
Figure imgf000040_0006
mayfirst be power normalised to meet a power constraint, and in step 904 these values are then directly sent through the communication channel 120. [00132] To simulate the channel 120 during training, an Additive White Gaussian Noise (AWGN) channel model is used, defined as:
Figure imgf000040_0007
where
Figure imgf000040_0013
is the Complex Gaussian distribution with zero mean and covariance matrix
Figure imgf000040_0011
being the identity matrix). Consequently, the key item decoder neural network 135k parameterised by
Figure imgf000040_0012
that maps the noisy latent vector
Figure imgf000040_0008
received and demodulated at the receiver in step 1001 and passed to the key item decoder neural network 135k in step 1002 to decode it back to the original frame domain
Figure imgf000040_0010
in step 1003 is defined as:
Figure imgf000040_0009
[00133] The key item encoder neural network 115k and key item decoder neural network 135k are then trained together using the method generally described in relation to Figure 5 using the mean-squared error as the loss function, defined as follows, to optimise the weights of the hidden layers thereof to minimise a reconstruction error.
Figure imgf000040_0014
[00134] A diagram of the key item encoder neural network 115k and key item decoder neural network 135k architecture is shown in Figure 7. The notation kxsycz is used to signify kernel size x, stride y and z kernels. The GDN layer refers to Generalised Divisive Normalisation, which is effective in density modelling and compression of images. The network is fully convolutional, therefore it can accept input of any height (H) and width (W). [00135] For the interpolation frames in between
Figure imgf000041_0001
are passed in step 902 the separate interpolation item encoder neural network 115i specified as
Figure imgf000041_0002
is used to encode motion representation and residual information determined in step 905 by motion and residual module 115m. The architecture of the interpolation item encoder neural network 115i and interpolation item decoder neural network 135i is shown in Figure 8. [00136] The motion representation is generated by motion and residual module 115m in respect of two other frames in the sequence by an optical flow estimator to generate the optical flow and residual information with respect to two frames
Figure imgf000041_0003
Figure imgf000041_0004
Figure imgf000041_0005
referred to as anchor frames. The anchor frames may be the previous and next frames in the sequence, such that t = 1. For this, let
Figure imgf000041_0006
be the opticalflow vectors that represent the motion information from frame
Figure imgf000041_0007
, and likewise for
Figure imgf000041_0008
The motion and residual module 115m then determines
Figure imgf000041_0010
as shown in Figure 8 to be a motion compensated anchor frame according to the opticalflow to produce an approximation
Figure imgf000041_0015
Figure imgf000041_0009
of frame using the determined optical flow. Then to determine the residual
Figure imgf000041_0011
for the
Figure imgf000041_0012
from the motion compensated anchor frame the motion and residual module 115m
Figure imgf000041_0013
determines the residual error in the optical flow interpolation:
Figure imgf000041_0014
[00137] The residual represents information not captured by opticalflow, such as occlusion/disocclusion and camera movements. To estimate the opticalflow, a pre-trained PWC-Net can be used in the motion and residual module 115m. [00138] Given all of this information, the interpolation item encoder neural network 115i
Figure imgf000041_0016
parameterised by φ- defines the mapping for interpolation data items into the latent space:
Figure imgf000041_0017
[00139] In step 906, the interpolation item encoder neural network 115i
Figure imgf000041_0018
is thus used to encode the data item
Figure imgf000041_0021
based on the data items optical flows and
Figure imgf000041_0020
Figure imgf000041_0019
residuals
Figure imgf000041_0022
In step 907, as described later, if the interpolation item encoder neural network 115i encodes descending ordering of information in increasing blocks of nodes, a bandwidth allocation module 115b may be used to select a number of blocks of an encoder output vector for the interpolation item to share or allocate bandwidth between interpolation items in the group of pictures. [00140] The values in the complex latent vector mayfirst be power normalised to meet a
Figure imgf000042_0001
power constraint, and in step 908, the transmitter transmits the signal values of encoder output vector over communications channel 120. [00141] The interpolation item decoder neural network 135i parameterised by
Figure imgf000042_0003
maps the noisy latent vector
Figure imgf000042_0002
received and demodulated at the receiver in step 1001 for the interpolation items and passed to the interpolation item decoder neural network 135k in step 1002 to decode it in step 1004 back to provide an estimate of the opticalflow, residual and a mask. That is, as can be seen in Figure 8, the interpolation item decoder neural network 135i parameterised by
Figure imgf000042_0004
defines the mapping:
Figure imgf000042_0008
where the mask and
Figure imgf000042_0006
a slice in the third dimension of
Figure imgf000042_0005
Figure imgf000042_0007
satisfies:
Figure imgf000042_0009
[00142] After the estimate of the opticalflow, residual and a mask are decoded, as can be seen in Figure 8 the decoder motion and residual module 135m reconstructs the frame by:
Figure imgf000042_0010
[00143] where ∗ refers to element-wise multiplication. The reconstructed frames
Figure imgf000042_0011
generated in step 1003 by the key item decoder neural network 135k and in steps 1004 and 1005 by the interpolation item decoder neural network 135i and decoder motion and residual module 135m are then passed to information sink 131 at which the reconstruction of the data source 111 is stored. Once all N frames of the GoP are reconstructed, the last frame
Figure imgf000042_0013
of the current GoP becomes thefirst key frame of the next GoP, and the same process is repeated.
Figure imgf000042_0012
[00144] The architecture of the interpolation item encoder neural network 115i and interpolation item decoder neural network 135i is functionally the same as for the key item encoder neural network 115k and key item decoder neural network 135k. Thus, similarly to the key item encoder and decoder pair as set out above, the interpolation item encoder neural network 115i is trained together with the interpolation item decoder neural network 135i using the
Figure imgf000042_0014
method generally described in relation to Figure 5 and the mean-squared error as the loss function, to optimise the weights of the hidden layers thereof to minimise a reconstruction error. Again, an AWGN noise model for the channel may be used. [00145] To allocate the available bandwidth, the bandwidth allocation module 115b is trained as a separate neural network parameterised by ψ having the architecture as shown in
Figure imgf000043_0001
Figure 11. Given a particular channel use constraint k per each GoP, reinforcement learning (RL) is utilised to learn the optimal bandwidth allocation policy for each frame in a GoP based on the frames themselves, that maximises the video quality. The bandwidth compression ratio is then defined as
Figure imgf000043_0002
[00146] As explained above in relation to Figure 5, the joint source-channel encoders having the neural network architecture of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, encoded video frames can be successively refined by sending increasingly more information.
This is achieved by dividing the latent vector into M equal sized blocks
Figure imgf000043_0003
Figure imgf000043_0004
while randomly varying the length L of the latent code selected for
Figure imgf000043_0005
transmission and decoding in each batch of input-output pairs
Figure imgf000043_0006
during training. This training process leads to the descending ordering of information from
Figure imgf000043_0007
to Thus, by selecting fewer blocks of the latent vector
Figure imgf000043_0008
for transmission, less
Figure imgf000043_0014
information encoding the input data item is transmitted, and the reconstruction of the data
Figure imgf000043_0009
item in at the decoder 130 includes less information. Similarly, by selecting more blocks of the latent vector for transmission, more information encoding the input data item is
Figure imgf000043_0013
Figure imgf000043_0012
transmitted, and the reconstruction of the data item in
Figure imgf000043_0010
at the decoder 130 includes more information. In this way, the number of blocks of the encoded latent vector
Figure imgf000043_0011
generated by the encoder neural networks 115 can be selected for transmission to vary the bandwidth used to transmit the encoding of each data item. This may be based on an assessment of a relative amount of information needed to minimise reconstruction errors while meeting a channel use condition or a bit budget for the GoP.
[00147] In the context of video interpolation, if the interpolation frame in consideration is
Figure imgf000043_0016
exactly the same with respect to the anchor frames that it is being interpolated from,
Figure imgf000043_0015
then no information needs to be transmitted as the pixels can simply be copied from the anchor frame to create the frame. On the other hand, if there is significant differences with respect to the anchor frames, then much more information will have to be sent in order to accurately interpolate the frame. Therefore, the neural network parameterised by ψ of the bandwidth
Figure imgf000043_0018
allocator module 115b is trained using reinforcement learning to allocate the available bandwidth to each of the frames in a GoP, using only the frames themselves, such that the
Figure imgf000043_0017
loss metric is minimised. [00148] That is, the nth GoP in a video is defined as where
Figure imgf000044_0002
Figure imgf000044_0001
[00149] The action set A consists of all the ways to allocate the available bandwidth k to each frame in the GoP. Since we are concerned with maximising the visual quality of thefinal video, we define [00150] the reward as
Figure imgf000044_0004
Figure imgf000044_0003
[00151] Deep Q-learning is used to learn the optimal allocation policy, where the network
Figure imgf000044_0006
seeks to approximate Here S represents the set of all states (i.e. all GoPs in a
Figure imgf000044_0005
video). The purpose of the Q function is to map each state and action pair to a Q value, which represents the total discounted reward from step n given the state and action pair The
Figure imgf000044_0007
Q function is defined as mapping:
Figure imgf000044_0008
where 0 ≤ γ ≤1 is the discount factor, which is chosen close to 1 when aiming to optimize the average reward. [00152] As indicated above, the purpose of the of deep neural network
Figure imgf000044_0010
of the bandwidth allocator module 115b is to approximate To that end, the mean-squared error loss
Figure imgf000044_0009
function L is used and gradient descent is performed to update the weights of the network as follows.
Figure imgf000044_0011
where
Figure imgf000045_0001
is the learning rate and B is a batch of data points containing sets of
Figure imgf000045_0003
Figure imgf000045_0002
[00153] The end-to-end objective to find the actions
Figure imgf000045_0007
selecting the optimum number of transmission blocks for each
Figure imgf000045_0008
for a GoP to minimise reconstruction errors for the available bandwidth can be separated into the following two optimisation problems:
Figure imgf000045_0004
and
Figure imgf000045_0005
where
Figure imgf000045_0006
and is the allocation of transmission blocks for each given to fr
Figure imgf000045_0009
Figure imgf000045_0010
ame
Figure imgf000045_0011
[00154] Upon initialisation of the communication system 100, the first frame
Figure imgf000045_0012
is sent using full bandwidth k, and thereafter, the neural network
Figure imgf000045_0013
of the bandwidth allocator module 115b is used to determine the optimal bandwidth allocation for the remaining N − 1 frames in the GoP such that the number of blocks of the encoder output vectors selected in step 907 for
Figure imgf000045_0015
Figure imgf000045_0014
transmission in step 908 is optimal. [00155] The performance of the joint source and channel coding communication system 100 described above in relation to Figures 6, 7, 8, 9, 10 and 11 has been measured for encoding, transmitting and reconstructing sequences of correlated data items at various contrast signal to noise (CSNR) ratios in a software-defined prototype of the communication system 100, and the results can be seen in Figure 17. In particular, the communication system 100 was trained optimally for encoding, transmitting and reconstructing sequences of correlated data items to minimise reconstruction errors at training CSNR levels of -5, 0, 5, 10 and 15 dB, and each trained model was then evaluated under test channel conditions of the same CSNR levels. [00156] Figure 17 plots the performance of each trained model for a bandwidth compression ratio of 0.031 at each evaluation CSNR as given, in the top pane (a), by the measured peak signal to noise (PSNR) ratio indicative of the reconstruction quality of the video frames at the receiver 130, and in the bottom pane (b), by the measured multiscale structural similarity index measure (MS-SSIM) indicative of the similarity at different scales between the video from the information source 111 and the video reconstructed at the information sink 131. MS-SSIM has been shown to perform better in approximating the human visual perception than the more simplistic structural similarity index (SSIM) on different subjective image and video databases [00157] As can be seen from Figure 17, the performance of the trained models in the communication system 100 improves as the training CSNR for the model increases and for each individual trained model, its performance also increases as the evaluation CSNR increases. [00158] This is expected as the channel noise corrupts the joint source and channel coding encoded symbols directly and the greater the noise power, the more difficult it is for the decoder 135 to correctly decode the noisy symbols. Most importantly, as can be seen in Figure 17,the trained models of the communication system 100 do not suffer from the cliff effect. We can see that, as the evaluation CSNR decreases (indicating greater and greater noise in the channel), the performance of each model gracefully degrades, as opposed to the cliff edge drop off that separation-based coding designs suffer from. This is because, in the communication system 100, the channel noise is allowed to directly distort the information being transmitted, which allows the cliff-edge effect to be avoided. [00159] In contrast, the cliff effect can clearly be seen in Figures 18, which plots the envelope of the performance for the different trained joint source channel coding models of the communication system 100 shown in Figure 17 against the same measured performance metrics of a conventional separate source coding by H.264 (i.e. MPEG-4 Part 10) and channel coding by a low-density parity-check (LDPC) code at different code rates. The bandwidth compression rate for the separate source and channel coding models is chosen to be at a level the achieves equivalent performance to the best performing joint source and channel coding model at the highest evaluation CSER of 15dB, in order to compare the best achievable reconstruction performance by both joint- and separate- coding models as the channel noise increases and the evaluation CSNR decreases. 
However, it should be noted that the bandwidth compression rate for the separate source and channel coding models that achieves the same peak performance to the best joint source and channel coding model at an evaluation CSNR of 15dB is lower than the bandwidth compression rate similarly performing joint source and channel coding model. This indicates that, for the same peak performance, the joint source and channel coding model achieves a higher bandwidth compression ratio, meaning less data is transmitted to achieve the same reconstruction performance. [00160] Regarding the cliff effect for the separation-based schemes, as can be seen from Figure 18, at every LDPC code rate, there exists an evaluation CSNR at which the performance of the separation-based coding scheme deteriorates rapidly, and above which CSRN the performance does not improve. This is due to the fact that the LDPC code rate is insufficient for the channel condition below that cliff edge evaluation CSNR threshold, and this would be observed as a signal drop out, with no received signal being decoded. This can cause significant problems when the channel conditions are variable giving poor reliability in transmission. Further, due to the pre-applied compression, the performance of the separation- coding does not improve as the channel condition improves above the cliff threshold CSNR either, meaning that, for better channel conditions above the cliff edge threshold, no improvement in quality is observable. As such, in separation coding, a cliff edge deterioration of the H.264 scheme is observed. This cliff edge in performance is simply not seen in the trained joint source channel coding models of the communication system 100 of the present disclosure. [00161] In fact, the overall performance of the trained joint source and channel coding models of the communication system 100 is better than the best available H.264 and LDPC codes, as the best performing trained joint source and channel coding model beats the best available separation-coding H.264 with LDPC coding scheme for all evaluation CSNR channel noise levels. This is the case in both the PSNR and MS-SSIM metrics, suggesting the superior compression capability of the communication system 100 over separation-based schemes. Overall, when assessed at the same bandwidth compression ratios the communication system 100 is 3.98dB and 6.07dB better on average in PSNR than H.264 with LDPC coding for ρ = 0:031 and ρ = 0:018, respectively. [00162] Thus from Figure 18 it can be seen that the trained joint source channel coding models of the communication system 100 consistently outperform the best performing conventional separation codes for reconstruction performance and compression rates, for all channel noise conditions. Further, because the encoder neural networks directly map the source inputs to the channel outputs, and the decoder neural networks directly map the noisy-received channel outputs to the reconstruction of the source inputs, the trained joint source channel coding models of the communication system 100 were consistently three orders of magnitude faster in terms of end-to-end encoding/decoding speed, compared to the best performing separate coding schemes, further reducing latency of transmission. 
[00163] In order to operate the joint source channel coding encoder and decoder at the performance envelope, the current channel condition needs to be monitored and the weights of the encoder and decoder neural networks need to be adjusted to the weights trained to match the channel condition. That is, weights are chosen to correspond to a training condition in which the channel noise or SNR matched the estimate of the current channel condition. In practice, to determine an accurate estimate (SNREST) of the current channel SNR that corresponds to the actual current channel SNR (SNRAWGN), the selection of the trained weights is adjusted such that the performance of the joint source channel coding encoder and decoder meets the rate- distortion curve of the performance envelope as closely as possible (i.e. such that SNREST = SNRAWGN is found). [00164] Figure 19 shows a performance comparison of the performance envelope of another example of the communication system of Figure 6 and the performance of separate source coding by H.265 (i.e. MPEG-H Part 2) and channel coding by LDPC at different code rates for encoding, transmitting and reconstructing other example sequences of correlated data items over a channel having AWGN at various signal to noise (SNR) ratios. [00165] As can be seen in the top pane of Figure 19, in terms of the PSNR metric, separation coding using H.265 and certain LDPC codings can outperform the performance envelope of the trained joint source and channel encoder and decoder as disclosed herein at higher PSNR values. However, a performance comparison is made using the more perceptually aligned MS- SSIM metric, as shown in the bottom pane of Figure 19, it can be seen that the performance envelope of the trained joint source and channel encoder and decoder as disclosed herein can outperform separation-based transmission with H.265. It should also be evident from Figure 19 that in the very low SNR regime (i.e., SNRAWGN < −1 dB), H.265 was unable to meet the compression rate required, and therefore did not produce any results in that range. The trained joint source and channel encoder and decoder as disclosed herein on the other hand, did not have this problem, and results were produced even at low SNR. Further optimization of the network architecture of the trained joint source and channel encoder and decoder as disclosed herein should bring the PSNR performance on a par with or better than H.265 for higher SNR values as well. [00166] Figure 20 shows a visual comparison of reconstructed frames of an example video encoded and transmitted across a channel having additive white Gaussian noise at an SNR of -4dB, 3dB and 13dB (from left to right), by an example of the communication system of Figure 6 trained at different SNRs (top pane, i.e. SNRTrain = -1dB, 6dB, 13dB, from left to right) and by a separate source coding by H.264 (i.e. MPEG-4 Part 10) and channel coding by a low-density parity-check (LDPC) code using different channel coding schemes (bottom pane, i.e.1/2 BPSK, 3/4 QPSK, 3/416QAM, from left to right). [00167] As can be seen, at SNRAWGN = 13 dB, the visual qualities of the videos produced by H.264 and the trained joint source and channel encoder and decoder as disclosed herein are similar. However, at SNRAWGN = 3 dB, the video produced by H.264 starts to look very pixelated, while the trained joint source and channel encoder and decoder as disclosed herein is still able to retain a smooth looking frame. 
At SNRAWGN = -4 dB, the capacity of the channel is too low for H.264 to compress the video sufficiently, therefore the output is simply black, while the trained joint source and channel encoder and decoder as disclosed herein is still able to achieve a reasonable video quality despite the very low channel SNR. [00168] In these tests, on average, in the AWGN case and a bandwidth compression rate of ρ = 0.031, the trained joint source and channel encoder and decoder as disclosed herein outperforms H.264 by 0.46 dB in PSNR and by 0.0081 in MS-SSIM for SNRAWGN ∈ [13, 20] dB, by 3.07 dB in PSNR and by 0.0485 in MS-SSIM for SNRAWGN ∈ [3, 6] dB. The trained joint source and channel encoder and decoder as disclosed herein falls short of H.265 by 3.22 dB in PSNR, but outperforms it by 0.0006 in MS-SSIM for SNRAWGN ∈ [13, 20] dB. Similarly, it is 0.61 dB worse than H.265 in PSNR but outperforms it by 0.0069 in MS-SSIM for SNRAWGN ∈ [3, 6] dB. [00169] With respect to complexity, using the NVIDIA TensorRT framework to optimize the inference time of the trained joint source and channel encoder and decoder as disclosed herein, it was found that the average inference time is approximately 26 ms. On the other hand, only the encoding time of H.264 took on average 24 ms, using hardware acceleration on the Intel i9- 9900K CPU. H.265 is even slower, at 92 ms. Therefore, the trained joint source and channel encoder and decoder as disclosed herein can be extremely efficient in practice using optimized hardware and library, more so than separation-based methods. [00170] Next, in Figure 21 a performance comparison is shown of a trained joint source and channel encoder and decoder as disclosed herein and the performance of separate source coding by H.264/H.265 and channel coding by a low-density parity-check (LDPC) 3/416QAM code for encoding, transmitting and reconstructing other example sequences of correlated data items over a channel having AWGN at a signal to noise (SNR) ratio of 20dB for different bandwidth compression rates ρ. By decreasing the bandwidth compression ratio ρ, the compression of the video is increased. To change ρ, the retraining of the joint source and channel encoder and decoder is not required. Rather, the bandwidth allocation module needs to be retrained with a different action set. As shown in Figure 21, we see that the joint source and channel encoder and decoder as disclosed herein beats H.264 with LDPC coding for all the bandwidth compression ratios tested in terms of both the PSNR and MS-SSIM metrics. It also beats H.265 using the MS-SSIM metric as shown in Fig.6b, although again, it falls short of H.265 in terms of the PSNR metric. [00171] Regarding the performance improvements that can be achieved by optimising the allocation of bandwidth to different frames in the Group of Pictures using the bandwidth allocation module and methods described herein, we refer to Figure 22. [00172] Figure 22 shows a performance comparison of the performance envelope of another example of the trained joint source and channel encoder and decoder as disclosed herein, showing the difference in performance of the system having a uniform bandwidth allocation to the frames in a group of pictures, a non-uniform bandwidth allocation in accordance with a pre- determined heuristic, and an optimal bandwidth allocation based on embodiments of the present disclosure in which a bandwidth allocation module is used. 
[00173] For comparison with the optimised bandwidth allocation, the results obtained by using the allocation network
Figure imgf000050_0003
is compared with that of uniform allocation (i.e. each frame having the same bandwidth allocation and with a heuristic bandwidth allocation policy. For the heuristic bandwidth allocation policy, 50% of the available bandwidth is allocated to the key frame and the remaining 50% of the available bandwidth is allocated to interpolation frames based on the magnitude of their optical flow (SSF) with respect to the reference frames. The intuition behind this heuristic policy is that the greater the magnitude of the optical flow, the more pixel warping is needed to interpolate the frame from its reference frames. Therefore, more bandwidth should be allocated to such frames. Further, since the reconstruction quality of the key frame affects the reconstruction quality of the remaining frames in the Group of Pictures, half of the bandwidth is allocated to it. In Figure 22, it can be seen that there is a clear and significant improvement in performance over both uniform and heuristic allocation when using the optimised bandwidth allocation provided by the trained allocation network
Figure imgf000050_0001
operated by the bandwidth allocation module as described herein. [00174] Overall, the optimised trained bandwidth allocation network improves upon the
Figure imgf000050_0002
uniform allocation policy by 0.35dB in PSNR and by 0.0025 in MS-SSIM, for ρ = 0.031. It also improves upon the heuristic allocation policy by 0.25 dB in PSNR and by 0.0015 in MS-SSIM, for the same ρ. [00175] Referring now to Figures 12 to 16, another detailed example embodiment of a communication system 100 for transmitting video across a communications channel will now be described. The communication system 100 in this embodiment operates a dynamic control module 115c allocating data items according to a dynamic decision agent, implemented as a Markov Decision Process (MDP). The communication system 100 in this embodiment determines interpolation information for encoding interpolation data items in a latent space of the encoder output vectors encoded by the key item encoder neural network 115k. Further still, the communication system 100 includes an interpolation encoder neural network 115i configured as a recurrent neural network (RNN), more specifically a Long Short-Term Memory (LSTM), in which the internal cell state is updated to recurrently encode successive interpolation items until the next key frame, which is then normalized to the power constraint and transmitted as the output vector for recurrent decoding at the decoder 135. In this way, a bandwidth allocation module 115b of the encoder 115 in this embodiment is implemented as the dynamic control module 115c for selecting whether or not successive items are for encoding together into the cell state of the interpolation encoder neural network 115i to share the bandwidth needed to transmit the encoder output vector between successive interpolation items. As well as providing a similar performance improvement compared to separate source and channel coding schemes as the previous embodiment, this arrangement provides enhanced capabilities for efficiently encoding streaming media, such as live video, for transmission in such a way that the reconstruction performance is maintained high across a wide range of different channel conditions but the bandwidth used can be low. While in this embodiment, the interpolation item encoder neural network 115i and interpolation item decoder neural network 135i are configured as RNNs, in particular LSTMs, these can be provided by any suitable function that can learn to recursively encode and decode successive items in a sequence into a state for transmission and recursive decoding, such as a transformer architecture. [00176] Thus, in this embodiment, thefixed GoP formulation, where a group of N frames are considered jointly for compression, are forgone, and instead the encoder control module 115c addresses the question of which items to allocate as key items and as interpolation items dynamically, using a dynamic decision agent. In particular, the dynamic decision agent is implemented in the embodiment as an infinite horizon Markov decision process (MDP). [00177] That is, as shown in Figures 12 and 15, consider a sequence of video frames,
Figure imgf000051_0001
received in step 1501 of encoder process 1500 as a sequence of input data items from information source 111. The sequence of video frames,
Figure imgf000051_0002
may be a stream of video frames, for example being recorded live by a security camera or a drone. In step 1503 each frame xt isfirst transformed into a latent space vector via the key item encoder neural
Figure imgf000051_0003
network 115k, again denoted as
Figure imgf000051_0004
Figure imgf000051_0005
[00178] The complimentary key item decoder neural network is also similarly defined to have an architecture similar to denoted as that performs the opposite
Figure imgf000051_0007
Figure imgf000051_0006
operation as
Figure imgf000051_0008
[00179] Then, at step 1504, the bandwidth allocation module 115b working with the dynamic decision agent implemented as an MDP by dynamic control module 115c, dynamically determines whether the data item should serve as a key item or an interpolation item. [00180] For the dynamic decision agent, the MDP state at time step t is defined as a tuple
Figure imgf000051_0010
Figure imgf000051_0009
where k is the number of frames since the last key frame. Then the MDP implements in the dynamic control module 115c an agent, who takes input at time step t the current state st and outputs a binary decision that states whether the current frame xt should be a key frame or not. That is, let the agent be a function where 0 implies the current
Figure imgf000052_0001
frame is not a key frame and 1 implies that it is. While in this embodiment, the state st = (zt , k) is considered for simplicity, this may not necessarily be the case, and st can in other embodiments include additional information such as motion information (e.g. opticalflow vectors) and residual information, as in the case in the embodiment described above. [00181] If the agent decides that xt should be a key frame, then in step 1505 the code word ct for transmission at time step t is taken to be equal to the latent code zt output by the key item encoder neural network 115k. [00182] On the other hand, if the agent decides that xt should not be a key frame, then in step 1506, a secondary encoder network is utilised as the interpolation item encoder neural network 115i. This neural network, and its counterpart interpolation item decoder neural network 135i, have an architecture and mode of operation that is different to the embodiment described above in Figures 6-11. Specifically, the interpolation item encoder neural network 115i in this embodiment is a recurrent neural network (RNN), more specifically a Long Short-Term Memory (LSTM), the architecture of which is shown in Figure 13, that takes at its input
Figure imgf000052_0004
layer a tuple and maps it to a code word ct.
Figure imgf000052_0002
[00183] That is, for the encoder 115,
Figure imgf000052_0003
[00184] As will be explained in more detail below, in steps 1507 and 1508, the interpolation item encoder neural network 115i may encode into ct multiple data items successively selected as interpolation items. [00185] Thus, thefinal set of codewords Ct is constructed from the sequence of keyframe decisions by the bandwidth allocation module 115b and dynamic
Figure imgf000052_0005
decision agent and codewords
Figure imgf000052_0007
such that
Figure imgf000052_0006
[00186] Thus, in the process 1500, if a new keyframe is initialised at time t + 1, then the codeword ct isflushed and stored in the set of codewords Ct to be transmitted and the latent vector of the new keyframe zt+1, is also appended in Ct+1 as it is a keyframe. This implies, if a frame is chosen to be a key frame, its codeword is independent of all other codewords; if a frame is not chosen to be a key frame, then its codeword is dependent on the previous codeword. This is similar to motion representation interpolation used in the embodiment described in relation to Figures 6-11 above, except rather than interpolating in the input data (i.e. pixel) domain, in this embodiment, interpolation between data items is now done implicitly by the interpolation item encoder neural network 115i in the latent space. The benefit of
Figure imgf000053_0001
interpolation in the latent space is that the interpolation item encoder neural network 115i may be able to transform the values in the pixel domain to a latent space where interpolation can be done more compactly. For example, in motion representation interpolation used in the embodiment described in relation to Figures 6-11 above, optical flow interpolation, which occurs in the input space (pixel) domain, does not capture occlusion/ disocculusion. As a result, the residual needs to be computed and transmitted to account for this type of information. This is due to the fact that opticalflow treats each frame as a 2D plane where the pixels are simply moved to obtain subsequent frames, without accounting for the fact that the scene itself may be 3D and therefore objects can appear as a result of objects in front of it moving in front of the camera. [00187] In contrast, in the latent space interpolation approach used in the interpolation item encoder neural network 115i in the present embodiment, the function
Figure imgf000053_0002
can be thought
Figure imgf000053_0003
of as a mapping to a space with higher dimensions (greater degree of freedom) that describes the various types of motion information (opticalflow, residual), and due to the greater degree of freedom, the interpolation can be done by translating the values in each dimension. For example, one dimension can be describing the x-axis movement, another can describe the occlusion of objects... etc. [00188] As indicated above, in the present embodiment, if there are consecutive non-key frames assigned, the loop of step 1507 to pass the next interpolation data item xt+1, to the key and then the interpolation item encoders to update internal state cell of the interpolation item encoder neural network, means that all those consecutive non-key frames will be represented by a single code word ct. The interpolation item encoder neural network 115i can thus be
Figure imgf000053_0004
seen as a codeword updater, that takes in new information about the current frame through the latent vector zt and updates the previous code word ct-1 to obtain the new code word ct. [00189] This is done using a recurrent neural network (RNN), such as a long-short term memory (LSTM) module as shown in Figure 13. That is, as shown in Figure 13, the internal state of the LSTM module (ℎ^) represents the current codeword ct, while the input to the LSTM at time t (i.e. xt), is the latent vector of the current frame zt. This architecture is only exemplary, and other neural network architectures can be used to perform the function of
Figure imgf000053_0005
[00190] To facilitate the decoding of the codewords by the decoder on receipt thereof, the encoder control module 115c may store a map of key frame allocations as a binary vector mt, in which each entry indicates whether the corresponding frame in the sequence is a key frame. This information may be sent by the transmitter 110 as side information using conventional digital modulation and channel coding.
[00191] If the set of codewords transmitted to the receiver 130 by steps 1505 and 1509 at time step t is Ct, where Kt = |Ct| is the number of codewords in the set, the bandwidth allocation module 115b may set a channel use constraint B, such that the average channel utilisation is below B. That is, writing |c| for the number of channel uses occupied by a codeword c,

    (1/t) · Σ_{c ∈ Ct} |c| ≤ B.
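Purely by way of illustration, the channel use accounting of paragraph [00191] can be sketched as follows. The function names and the example numbers are assumptions made here for illustration; the allocation policy actually used by the bandwidth allocation module 115b is not limited to this sketch.

    def average_channel_use(codeword_lengths, t):
        """Average number of channel uses per frame after t frames."""
        return sum(codeword_lengths) / t

    def within_budget(codeword_lengths, t, B):
        # The bandwidth allocation module keeps this average below the constraint B.
        return average_channel_use(codeword_lengths, t) <= B

    # Example: three codewords occupying 96, 32 and 32 channel uses over 8 frames,
    # against a budget of B = 24 channel uses per frame on average.
    print(within_budget([96, 32, 32], t=8, B=24))  # True: 160 / 8 = 20 <= 24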
[00192] The codeword set Ct may then be power normalised and transmitted by the carrier modulator 118 and antenna 119 across the channel 120. It should be noted that the process 1500, as set out more definitely in Algorithm 1 of Figure 14a, does not wait until a certain time t to send the codewords; rather, whenever a new codeword is appended to the set at time t, the new codeword is transmitted as soon as it becomes available.
[00193] Turning now to the receiver process 1600 shown in Figure 16, and as set out more definitely in Algorithm 2 of Figure 14b, in step 1601 the receiver receives from the communications channel 120 and demodulates a set of noisy codewords Ĉt and the key frame allocation map mt.
[00194] Then, in steps 1602-1606, the decoder 130 follows a similar process to the encoder 110 for decoding. In this respect, if the key frame allocation map mt indicates that a noisy detected codeword ĉt represents a key frame, in step 1602 the decoder control module 135c passes the codeword to the key item decoder neural network 135k, where in step 1603 it decodes and recovers a reconstruction x̂t of the encoder input vector xt to generate a representation of the input data item.
[00195] On the other hand, if the key frame allocation map mt indicates that the noisy detected codeword ĉt does not represent a key frame, in step 1602 the decoder control module 135c passes the codeword ĉt to the interpolation item decoder neural network 135i, defined as a function that takes as its input the tuple (ĉt, ẑt-1), and in step 1605 the interpolation item decoder neural network 135i decodes the noisy detected codeword by mapping it to an estimate ẑt of the frame latent vector. As can be seen from the input tuple, the mapping in step 1605 is based on the received signal values in the codeword ĉt and a latent representation ẑt-1 of the previous data item, which is in this case generated by operating a version of the key item encoder neural network 115k function, denoted fk below, stored locally at the decoder 130 (not shown in Figure 1, but this can be seen in the receiver 130 in Figure 12), on the reconstruction x̂t-1 of the encoder input vector xt-1 at the previous time step (i.e. by computing ẑt-1 = fk(x̂t-1) at the decoder 135).
[00196] That is, writing gi(·) for the function of the interpolation item decoder neural network 135i, the noisy codeword is processed by the decoder 135 in steps 1602 and 1605 for a time step t as follows:

    ẑt = ĉt, if the key frame allocation map mt indicates that frame t is a key frame;
    ẑt = gi(ĉt, fk(x̂t-1)), otherwise.
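For orientation only, the receiver-side dispatch of steps 1602-1606 can be sketched as follows. This is a hypothetical sketch in which key_encoder, key_decoder and interp_decoder stand in for the locally stored copy of the key item encoder neural network 115k and for the decoder neural networks 135k and 135i respectively; their interfaces are assumptions made here for illustration.

    def decode_step(c_hat, is_key, x_prev_hat, key_encoder, key_decoder, interp_decoder):
        """One receiver time step, per paragraphs [00194] to [00197]."""
        if is_key:
            z_hat = c_hat                          # a key codeword is the noisy latent itself
        else:
            z_prev_hat = key_encoder(x_prev_hat)   # re-derive the previous latent locally
            z_hat = interp_decoder(c_hat, z_prev_hat)
        x_hat = key_decoder(z_hat)                 # latent vector -> frame reconstruction
        return x_hat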
[00197] In step 1606, the estimate ẑt of the latent vector for time step t decoded by the interpolation item decoder neural network 135i in step 1605 is passed to the key item decoder neural network 135k (i.e. to step 1604), which decodes the latent vector ẑt to recover a reconstruction x̂t of the encoder input vector xt, interpolated from the reconstruction x̂t-1 of the previous encoder input vector xt-1, to generate a representation of the input data item for provision to the information sink 131.
[00198] The process performed by the decoder 135 using the key item decoder neural network 135k, the interpolation item decoder neural network 135i, and the version of the key item encoder neural network 115k function fk stored locally at the decoder 130, is set out in Algorithm 2 shown in Figure 14b. As can be seen from Algorithm 2, the interpolation item decoder neural network 135i in essence provides a decoder process that recursively unpacks the codeword ĉt by conditioning on the latent vector estimate from the previous time step. In practice, the unpacking function is also performed by an LSTM module: the internal state ht represents the current state of the unpacked codeword yt, and the input of the LSTM represents the current latent vector estimate. This recursive encoding and decoding process for successive interpolation items is illustrated in Figure 12 for increasing time steps.
[00199] The training of the key item encoder and decoder neural network pair, 115k and 135k, and the interpolation item encoder and decoder neural network pair, 115i and 135i, is such that the neural networks are trained together using the method generally described in relation to Figure 5, with the mean-squared error as the loss function, to optimise the weights of their hidden layers so as to minimise a reconstruction error. Again, an AWGN noise model for the channel may be used.
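By way of a hedged illustration of the joint end-to-end training described in paragraph [00199], a single training step might be sketched as follows, assuming differentiable encoder and decoder modules and a unit-power AWGN channel at a chosen signal-to-noise ratio. None of the names or hyperparameters below are taken from the original; they are assumptions made for illustration.

    import torch
    import torch.nn.functional as F

    def awgn(z, snr_db):
        # Add white Gaussian noise at the given SNR, assuming unit signal power.
        noise_power = 10 ** (-snr_db / 10)
        return z + noise_power ** 0.5 * torch.randn_like(z)

    def train_step(frames, encoder, decoder, optimiser, snr_db=10.0):
        """One end-to-end step: encode, pass through AWGN, decode, MSE loss."""
        z = encoder(frames)                 # key + interpolation encoding
        z = z / z.pow(2).mean().sqrt()      # power normalisation before transmission
        x_hat = decoder(awgn(z, snr_db))    # reconstruct from the noisy codewords
        loss = F.mse_loss(x_hat, frames)    # reconstruction error to be minimised
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()
        return loss.item()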
[00200] While the examples above indicate a software-driven implementation of components of the invention by a more general-purpose processor, such as a CPU core, based on program logic or instructions stored in a memory, in alternative embodiments certain components of the invention may be partly embodied as pre-configured electronic systems or embedded controllers and circuits implemented as programmable logic devices, using, for example, application-specific integrated circuits (ASICs) or field-programmable gate arrays (FPGAs), which may be partly configured by embedded software or firmware.
[00201] It should be noted that, in accordance with the present disclosure, the communication channel 120 should be understood as any transformation from the channel input space to the channel output space that includes a random transformation due to the channel. This may include additive noise, interference, or other stochastic properties of the channel that will randomly transform the transmitted signal, e.g. fading and multi-path effects in wireless channels. Thus the reference to the noise-affected version ẑ of the vector of signal values z received at the decoder should be understood to indicate that the input ẑ to the decoder is a vector of values correlated with the transmitted vector z of signal values (which is itself correlated with the input data x from the information source), transformed by the communication channel 120, whether that transformation is 'noise' or another channel transformation.
[00202] In this respect, in accordance with the present disclosure, the communication channel 120 should be understood as encompassing any channel that applies a random transformation to its input to produce the channel output.
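As a purely illustrative sketch of the view of the channel as a random transformation set out in paragraphs [00201] and [00202], a fading channel model might be written as follows. The flat Rayleigh fading model and its parameters are assumptions chosen here for illustration, and are not channel models prescribed by the disclosure.

    import torch

    def rayleigh_fading_channel(z, snr_db):
        """Randomly transform z: complex flat fading plus additive white noise."""
        # z is assumed to be a complex-valued tensor of IQ samples with unit average power.
        h = (torch.randn_like(z.real) + 1j * torch.randn_like(z.real)) / 2 ** 0.5
        noise_power = 10 ** (-snr_db / 10)
        n = (noise_power / 2) ** 0.5 * (
            torch.randn_like(z.real) + 1j * torch.randn_like(z.real)
        )
        return h * z + n  # the decoder sees values correlated with, not equal to, z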
[00203] Thus, although the examples described above disclose the transmitted vector z of signal values being raw signal values in IQ space, taking any value (although this may be constrained to correspond to a fixed, pre-defined constellation) transmittable by the transmitter over the communication channel 120, in other embodiments the communication channel 120 may be one that also includes an existing channel encoder and decoder scheme, in which the signal space of the channel input may be the predetermined finite set of symbols of the channel code (which could be bit values) for modulating a signal carrier to provide input signals in the alphabet of the input signal space for the communication channel. Thus, besides random noise applied by the communication channel, the transformation applied by the communication channel 120 may, in embodiments, also include an existing channel code. In these embodiments, the encoder and decoder pairs may be configured to learn a mapping to a predefined alphabet of symbols corresponding to a message alphabet for an existing channel code by which the input signals to the given communications channel are modulated. In this case, the encoder output vectors z output from the encoder 115 will be mapped into the message alphabet of the corresponding channel code (rather than, for example, the raw IQ values transmittable by the transmitter over the communication channel 120, as in the embodiments described above). The noise-affected channel output ẑ input to the decoder neural network 135 may correspond to the decoded message of the existing channel decoder. In this respect, in these embodiments, the encoder neural network 115 and decoder neural network 135 may learn an optimum mapping of the input information source 111 to inputs of an existing channel code of the communications channel 120 that reduces reconstruction errors at the output 131 of the decoder neural network 135. Although acting as an outer code in these embodiments, this learned coding of the encoder neural network 115 and decoder neural network 135 is still optimised end-to-end based on the characteristics of the communication channel 120 to reduce reconstruction errors, even though in these alternative embodiments the communication channel 120 includes an existing channel code. This is unlike existing modular source codes, which are defined independently of the random transformation applied by any channel.
[00204] Throughout the description and claims of this specification, the words "comprise" and "contain", and variations of them, mean "including but not limited to", and they are not intended to (and do not) exclude other components, integers or steps. Throughout the description and claims of this specification, the singular encompasses the plural unless the context otherwise requires. In particular, where the indefinite article is used, the specification is to be understood as contemplating plurality as well as singularity, unless the context requires otherwise.
[00205] Features, integers, characteristics or groups described in conjunction with a particular aspect, embodiment or example of the invention are to be understood to be applicable to any other aspect, embodiment or example described herein unless incompatible therewith. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive. The invention is not restricted to the details of any foregoing embodiments. The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed. In particular, any dependent claims may be combined with any of the independent claims and any of the other dependent claims.

Claims

1. An encoder for use in a transmitter of a communications system for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding, the encoder comprising: a key item encoder neural network for encoding data items selected from the sequence to serve as a key item that can be directly reconstructed by the decoder, the key item encoding being based on the input data item and being independent of any other data item in the sequence; an interpolation item encoder neural network for encoding data items selected from the sequence to serve as an interpolation item that can be reconstructed by the decoder using interpolation, the interpolation item encoding using data representing the input data item and at least one previous data item in the sequence; the key item encoder neural network and interpolation item encoder neural network having hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in an encoder input layer thereof to encoder output vectors provided at nodes of an encoder output layer thereof, the encoder output vectors being used for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the data items across a communication channel; the key item encoder neural network and interpolation item encoder neural network having in the communications system a respective complementary key item decoder neural network and interpolation item decoder neural network for receiving a noise-affected version of the encoder output vector from a receiver receiving and demodulating the signal transmitted across the communication channel and reconstructing the input vector to generate a representation of the input data item; wherein the connecting node weights of the key item encoder neural network and interpolation item encoder neural network have been trained together with the respective complementary key item decoder neural network and interpolation item decoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items.
2. A decoder for use in a receiver of a communications system for conveying sequences of correlated data items from an information source across a communications channel using joint source and channel coding, the decoder comprising: a key item decoder neural network for decoding data items from the sequence indicated as key items, the key item decoding being based on a noise-affected version of an encoder output vector generated at a transmitter by a complementary key item encoder neural network to encode the data item based on the input data item independent of any other data item in the sequence, the key item decoder neural network being configured for reconstructing the input vector to generate a representation of the input data item directly from the noise-affected version of the encoder output vector and independently of any other data item in the sequence; an interpolation item decoder neural network for decoding data items indicated as interpolation items, the interpolation item decoding being based on data representing at least one previous data item in the sequence and a noise-affected version of an encoder output vector generated at a transmitter by a complementary interpolation item encoder neural network to encode the data item based on data representing the input data item and at least one previous data item in the sequence, the noise-affected version of the encoder output vector having been received and demodulated at the receiver based on the signal transmitted across the communication channel, the interpolation item decoder neural network being configured for reconstructing the input vector by interpolation from reconstructions of representations of other data items in the sequence to generate a representation of the input data item; the key item decoder neural network and interpolation item decoder neural network having hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in a decoder input layer thereof to decoder output vectors provided at nodes of a decoder output layer thereof, the decoder output vectors providing a reconstruction of the encoder input vector to generate a representation of the input data item, wherein the connecting node weights of the key item decoder neural network and interpolation item decoder neural network have been trained together with the respective complementary key item encoder neural network and interpolation item encoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data.
3. A communication system for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding, comprising: a transmitter including an encoder as claimed in claim 1, the transmitter being configured for transmitting signals over a communication channel based on signal values of encoder output vectors of the key item encoder neural network and interpolation item encoder neural network provided by the encoder; a receiver including a decoder as claimed in claim 2, the receiver being configured for receiving and demodulating a noise-affected version of the encoder output vectors based on the signal transmitted across the communication channel and passing them to the decoder for reconstructing the sequence of correlated data items.
4. An encoder, decoder or communication system, as claimed in any preceding claim, wherein the sequences of correlated data items are a series of image frames providing a video.
5. An encoder, decoder or communication system, as claimed in any preceding claim, wherein the encoder is configured to create a coded and compressed representation of the input data as an encoder output vector for transmission across the communications channel; and wherein the decoder is configured to decode the noise-affected version of the encoder output vector back into an uncompressed reconstruction of the input data.
6. An encoder, decoder or communication system, as claimed in any preceding claim, wherein the interpolation encoder input layer is configured such that the interpolation item encoding uses data received at the input layer thereof representing the input data item in pixel space and at least one previous data item in the sequence in pixel space.
7. An encoder, decoder or communication system, as claimed in claim 6, wherein the data representing at least one previous data item in the sequence used by the interpolation item encoder to encode interpolation items comprises: the motion representation information representing the relative motion between the data item and at least one other data item in the sequence; and the residual information between the data item and a motion compensated version of at least one other data item in the sequence using the motion representation information in respect of that data item.
8. An encoder, decoder or communication system, as claimed in any of claims 1 to 5, wherein the interpolation encoder input layer is configured such that the interpolation item encoding uses data received at the input layer thereof representing: the input data item in the latent space defined by the output of the key item encoder neural network; and at least one previous data item in the sequence in the latent space defined by the output of the key item encoder neural network.
9. An encoder, decoder or communication system, as claimed in claim 8, wherein the data representing the input data item used by the interpolation item encoder to encode interpolation items comprises: the key item encoder output vector encoded for the data item by the key item encoder neural network; and wherein the data representing at least one previous data item in the sequence used by the interpolation item encoder to encode interpolation items comprises an encoder output vector transmitted by the encoder for at least one previous data item in the sequence.
10. An encoder, decoder or communication system, as claimed in claim 9, wherein the data representing at least one previous data item in the sequence used by the interpolation item decoder to decode interpolation items comprises a noise-affected version of an encoder output vector or a reconstruction of the encoder input vector providing a representation of the input data item for at least one previous data item in the sequence.
11. An encoder, decoder or communication system, as claimed in any preceding claim, wherein the data representing at least one previous data item in the sequence used by the interpolation item encoder to encode interpolation items further comprises data representing at least one subsequent data item in the sequence.
12. An encoder, decoder or communication system, as claimed in any preceding claim, the encoder and decoder further comprising a static control module configured to select data items from the sequence of data items as key items and interpolation items according to a fixed order specified by a predetermined group of items, the static control module being further configured to use the key item encoder and decoder to encode and decode data items selected as key items, and to use the interpolation item encoder and decoder to encode and decode data items selected as interpolation items.
13. An encoder, decoder or communication system, as claimed in any preceding claim, the encoder further comprising a dynamic control module having a dynamic decision agent configured to dynamically choose whether the data item is to serve as a key item or an interpolation item.
14. An encoder, decoder or communication system, as claimed in claim 13, wherein the dynamic decision agent is configured to dynamically choose whether the data item is to serve as a key item or an interpolation item based at least on one or more of: the current data item; the number of data items transmitted since last key item; a current average channel utilisation; and a channel utilisation constraint.
15. An encoder, decoder or communication system, as claimed in claim 13 or 14, wherein the dynamic decision agent is configured to dynamically choose whether the data item is to serve as a key item or an interpolation item so that the average channel utilisation is below the channel utilisation constraint.
16. An encoder, decoder or communication system, as claimed in claim 13, 14 or 15, wherein the dynamic control module is configured to: select, based on a decision output by the decision agent for the data item, whether the data item is to serve as a key item or an interpolation item; and, if the data item is selected to serve as a key item, use the key item encoder to encode the data item in the sequence to provide a key item encoder output vector for the item, the encoder being configured for transmitting the key item encoder output vector on the communications channel.
17. An encoder, decoder or communication system, as claimed in claim 16, the dynamic control module being further configured to: if the data item is selected to serve as an interpolation item, use the interpolation encoder to encode the data item to provide an interpolation item encoder output vector for the item, the encoder being configured for transmitting the interpolation item encoder output vector on the communications channel.
18. An encoder, decoder or communication system, as claimed in any of claims 13 to 17, the dynamic decision agent being configured to generate data mapping, for the sequence of data items, which data items are key data items and which data items are interpolation data items, for transmission across the communications channel and for use by the decoder to determine whether the received noise-affected version of an encoder output vector should be decoded by the key item decoder neural network or the interpolation item decoder neural network.
19. An encoder, decoder or communication system, as claimed in any preceding claim, wherein the encoder output layers of the interpolation item encoder neural network and the decoder input layers of the interpolation item decoder neural network are divided into ordered blocks, and wherein the neural networks are trained such that the interpolation item encoder neural network encodes descending ordering of information in increasing blocks of nodes, and such that, for interpolation items, the decoder reconstructs an increasingly refined representation of the input data with increasing blocks received in the noise-affected version of an encoder output vector.
20. An encoder, decoder or communication system, as claimed in claim 19, wherein the communication system further comprises a bandwidth allocation module configured to determine, for each data item in the sequence selected to serve as an interpolation item, a number of blocks of the interpolation encoder output layer to be transmitted over the communication channel, so as to allocate the available bandwidth in the communications channel to the transmission of interpolation items.
21. An encoder, decoder or communication system, as claimed in claim 20, wherein the bandwidth allocation module is further configured to determine the number of blocks of the interpolation encoder output layer to be transmitted over the communication channel to seek to minimise the reconstruction error between the representation of the input data reconstructed at the decoder and the input data encoded at the encoder.
22. An encoder, decoder or communication system, as claimed in claim 20 or 21, wherein the bandwidth allocation module is configured to determine the number of blocks of the interpolation encoder output layer to be transmitted over the communication channel based on at least motion representation information determined to represent the relative motion between the data item and at least one other data item in the sequence.
23. An encoder, decoder or communication system, as claimed in claim 20, 21 or 22, wherein the bandwidth allocation module is configured to determine a number of blocks of the interpolation encoder output layer to be transmitted over the communication channel, so as to seek to optimally allocate the available bandwidth in the communications channel between a group, a set or the whole sequence of data items to be transmitted.
24. An encoder, decoder or communication system, as claimed in any of claims 1 to 19, wherein the interpolation encoder neural network is configured to: maintain and update an internal state as successive interpolation items of a group of consecutive interpolation items are encoded by the interpolation encoder neural network; and after successive interpolation items have been encoded into the internal state, to provide the internal state as the interpolation encoder output vector for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the group of consecutive interpolation data items across the communication channel.
25. An encoder, decoder or communication system, as claimed in claim 24, wherein the encoder neural network is configured to output an encoder output vector for transmission for each key item and each group of consecutive interpolation items between key items.
26. An encoder, decoder or communication system, as claimed in claim 24 or 25, wherein the interpolation decoder neural network is configured to: for a group of consecutive interpolation items, recursively decode the noise-affected version of the encoder output vector received from a receiver to thereby reconstruct the encoder input vectors of successive interpolation items to generate a representation of the input data items of the group of consecutive interpolation items.
27. An encoder, decoder or communication system, as claimed in claim 24, 25 or 26, wherein the interpolation encoder neural network and the interpolation decoder neural network are both provided by a recurrent neural network, optionally a Long Short-Term Memory.
28. An encoder, decoder or communication system, as claimed in any preceding claim, wherein the encoder output vectors provide values in a signal space that represent in-phase and quadrature components for modulation of the carrier signal for transmission over the communication channel.
29. An encoder, decoder or communication system, as claimed in any preceding claim, wherein the encoder output vectors provide values defining a probability distribution sampleable to provide values in a signal space that represent in-phase and quadrature components for modulation of the carrier signal for transmission over the communication channel.
30. An encoder, decoder or communication system, as claimed in any preceding claim, wherein the encoder output vectors provide values corresponding to a predetermined finite set of symbols of an existing channel encoder and decoder scheme for transmission of data over the communication channel.
31. An encoder, decoder or communication system, as claimed in any preceding claim, wherein the encoder is configured to encode the sequences of correlated data items from an information source for transmission across the communications channel as a streaming media.
32. An encoder, decoder or communication system, as claimed in any of claims 1 to 30, wherein the encoder is configured to encode the sequences of correlated data items from an information source into a static media file.
33. An encoder, decoder or communication system, as claimed in any preceding claim, wherein the correlated data is a video and wherein the items are video frames.
34. An encoder, decoder or communication system, as claimed in claim 33, wherein the correlated data items are each represented by a 3D matrix with a depth based on the colour channels, a height based on the height of the frame and a width based on the width of the frame.
35. An encoder, decoder or communication system, as claimed in claim 33 or 34, wherein the encoder input layers of the key item encoder neural network and/or the interpolation item encoder neural network are configured to receive video frames as input vectors.
36. An encoder, decoder or communication system, as claimed in claim 35, wherein the quantity of information included in the output vector of the key item encoder neural network and/or the interpolation item encoder neural network is smaller than that included in the input vector.
37. A method for conveying sequences of correlated data items from an information source across a communications channel using joint source and channel coding, the method comprising, at a transmitter, for each data item in the sequence: selecting data items from the sequence of data items to serve as key items and interpolation items; encoding data items to serve as key items using a key item encoder neural network, the key item encoding being based on the input data item and being independent of any other data item in the sequence; encoding data items to serve as interpolation items using an interpolation item encoder neural network, the interpolation item encoding using data representing the input data item and at least one previous data item in the sequence; the key item encoder neural network and interpolation item encoder neural network having hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in an encoder input layer thereof to encoder output vectors provided at nodes of an encoder output layer thereof, the encoder output vectors being used for providing values in a signal space for modulating a carrier signal for transmission of a transformed version of the data items across a communication channel; and transmitting signals over a communication channel based on signal values of encoder output vectors of the key item encoder neural network and interpolation item encoder neural network; the method further comprising, at a receiver: receiving and demodulating a noise-affected version of the encoder output vectors based on the signal transmitted across the communication channel; decoding data items from the sequence indicated as key items using a key item decoder neural network based on a noise-affected version of the encoder output vector for the data item, the key item decoder neural network being configured for reconstructing the input vector to generate a representation of the input data item directly from the noise-affected version of the encoder output vector and independently of any other data item in the sequence; and decoding data items from the sequence indicated as interpolation items using an interpolation item decoder neural network based on data representing at least one previous data item in the sequence and the noise-affected version of an encoder output vector for the data item, the interpolation item decoder neural network being configured for reconstructing the input vector by interpolation from reconstructions of representations of other data items in the sequence to generate a representation of the input data item; the key item decoder neural network and interpolation item decoder neural network having hidden layers of connecting nodes with weights that, in use, map vectors of values received at nodes in a decoder input layer thereof to decoder output vectors provided at nodes of a decoder output layer thereof, the decoder output vectors providing a reconstruction of the encoder input vector to generate a representation of the input data item, wherein the connecting node weights of the key item encoder neural network and interpolation item encoder neural network have been trained together with the respective complementary key item decoder neural network and interpolation item decoder neural network, to minimise an objective function characterising a reconstruction error between input-output pairs of training data items.
38. A method of training an encoder and a decoder for use in a communication system as claimed in any of claims 1 to 36 for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding, comprising: for input-output pairs of a set of training data items from the information source passed to the encoder, determining an objective function characterising a reconstruction error between input-output pairs of training data from the information source passed to the encoder and the representation of the input data reconstructed at the decoder; and using an appropriate optimisation algorithm operating on the objective function, updating the connecting node weights of the hidden layers of the key item encoder neural network, interpolation item encoder neural network, key item decoder neural network and interpolation item decoder neural network to seek to minimise the objective function.
39. A method as claimed in claim 38, wherein the encoder neural networks and decoder neural networks have been trained together using training data in which a model of the communication channel is used to estimate channel noise and add it to the transmitted signal values to generate a noise-affected version of the vector of signal values in the input-output pairs of training data.
40. A method as claimed in claim 38 or 39, wherein the encoder output layers of the interpolation item encoder neural network and the decoder input layers of the interpolation item decoder neural network are divided into ordered blocks, wherein the encoder output vector passed to the decoder input layer for each input-output pair during training is selected to have a random number of the ordered blocks, such that, following training, the interpolation item encoder neural network encodes descending ordering of information in increasing blocks of nodes, and such that, for interpolation items, the decoder reconstructs an increasingly refined representation of the input data with increasing blocks received in the noise-affected version of an encoder output vector.
41. Computer readable medium comprising one or more instructions which, when executed, cause at least one of a transmitter and a receiver to operate in accordance with the method of claim 37.
42. Computer readable medium comprising one or more instructions which, when executed, cause a computing device to operate a method as claimed in claim 38, 39 or 40 of training an encoder and a decoder for use in a communication system as claimed in any of claims 1 to 36 for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding.
PCT/GB2022/052266 2021-09-06 2022-09-06 Encoder, decoder and communication system and method for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding, and method of training an encoder neural network and decoder neural network for use in a communication system Ceased WO2023031632A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB2112665.1A GB202112665D0 (en) 2021-09-06 2021-09-06 Title too long see minutes
GB2112665.1 2021-09-06

Publications (1)

Publication Number Publication Date
WO2023031632A1 true WO2023031632A1 (en) 2023-03-09

Family

ID=78076807

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2022/052266 Ceased WO2023031632A1 (en) 2021-09-06 2022-09-06 Encoder, decoder and communication system and method for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding, and method of training an encoder neural network and decoder neural network for use in a communication system

Country Status (2)

Country Link
GB (1) GB202112665D0 (en)
WO (1) WO2023031632A1 (en)


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114494048B (en) * 2022-01-11 2024-05-31 辽宁师范大学 Multi-stage progressive mixed distortion image restoration method based on supervised contrast learning
CN118509109A (en) * 2023-02-14 2024-08-16 华为技术有限公司 A source channel joint coding method and system


Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020035683A1 (en) * 2018-08-15 2020-02-20 Imperial College Of Science, Technology And Medicine Joint source channel coding for noisy channels using neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MIKOLAJ JANKOWSKI ET AL: "AirNet: Neural Network Transmission over the Air", ARXIV.ORG, 26 May 2021 (2021-05-26), XP081968110 *
SKATCHKOVSKY NICOLAS ET AL: "End-to-End Learning of Neuromorphic Wireless Systems for Low-Power Edge Artificial Intelligence", 2020 54TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, AND COMPUTERS, IEEE, 1 November 2020 (2020-11-01), pages 166 - 173, XP033921597, DOI: 10.1109/IEEECONF51394.2020.9443351 *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230169694A1 (en) * 2021-11-30 2023-06-01 Qualcomm Incorporated Flow-agnostic neural video compression
US12470726B2 (en) * 2021-12-14 2025-11-11 Intel Corporation Validation framework for media encode systems
US20220109825A1 (en) * 2021-12-14 2022-04-07 Karteek Renangi Validation framework for media encode systems
CN116155453A (en) * 2023-04-23 2023-05-23 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) A decoding method and related equipment for dynamic signal-to-noise ratio
CN116155453B (en) * 2023-04-23 2023-07-07 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Decoding method and related equipment for dynamic signal-to-noise ratio
CN116192340A (en) * 2023-04-27 2023-05-30 济南安迅科技有限公司 Error control method and device in optical communication network
CN116192340B (en) * 2023-04-27 2023-06-30 济南安迅科技有限公司 Error control method and device in optical communication network
CN116456094A (en) * 2023-06-15 2023-07-18 中南大学 A distributed video hybrid digital-analog transmission method and related equipment
CN116456094B (en) * 2023-06-15 2023-09-05 中南大学 Distributed video hybrid digital-analog transmission method and related equipment
CN116614637A (en) * 2023-07-19 2023-08-18 腾讯科技(深圳)有限公司 Data processing method, device, equipment and readable storage medium
CN116614637B (en) * 2023-07-19 2023-09-12 腾讯科技(深圳)有限公司 Data processing method, device, equipment and readable storage medium
CN116781156A (en) * 2023-08-24 2023-09-19 济南安迅科技有限公司 Channel coding method for optical communication network
CN116781156B (en) * 2023-08-24 2023-11-10 济南安迅科技有限公司 Channel coding method for optical communication network
WO2025077623A1 (en) * 2023-10-09 2025-04-17 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Systems and methods for obtaining neural networks for data compression
CN117578741A (en) * 2024-01-15 2024-02-20 湖南智焜能源科技有限公司 Topology identification method and system for measuring switch
CN117578741B (en) * 2024-01-15 2024-03-29 湖南智焜能源科技有限公司 Topology identification method and system for measuring switch
CN119484857A (en) * 2024-11-18 2025-02-18 重庆邮电大学 A screen content image transmission method based on semantic communication
CN119847207A (en) * 2025-01-03 2025-04-18 暨南大学 Multi-unmanned aerial vehicle path planning method and system
CN120050002A (en) * 2025-02-24 2025-05-27 澳门大学 Variable length channel coding method with feedback, electronic device and storage medium
CN120321401A (en) * 2025-06-13 2025-07-15 深圳大学 Video codec transmission system construction method and system, equipment and medium
CN120321401B (en) * 2025-06-13 2025-10-03 深圳大学 Video coding and decoding transmission system construction method and system, equipment and medium
CN120744324A (en) * 2025-08-26 2025-10-03 四川大学华西医院 Deep learning-based signal bad track restoration method and device for cortex electroencephalogram

Also Published As

Publication number Publication date
GB202112665D0 (en) 2021-10-20

Similar Documents

Publication Publication Date Title
WO2023031632A1 (en) Encoder, decoder and communication system and method for conveying sequences of correlated data items from an information source across a communication channel using joint source and channel coding, and method of training an encoder neural network and decoder neural network for use in a communication system
WO2022248891A1 (en) Communication system, transmitter, and receiver for conveying data from an information source across a communication channel using joint source and channel coding, and method of training an encoder neural network and decoder neural network for use in a communication system
US20210319286A1 (en) Joint source channel coding for noisy channels using neural networks
US20210351863A1 (en) Joint source channel coding based on channel capacity using neural networks
US10797728B1 (en) Systems and methods for diversity bit-flipping decoding of low-density parity-check codes
WO2020035684A1 (en) Joint source channel coding of information sources using neural networks
US9504042B2 (en) System and method for encoding and decoding of data with channel polarization mechanism
US20230079744A1 (en) Methods and systems for source coding using a neural network
Jankowski et al. AirNet: Neural network transmission over the air
US20210250049A1 (en) Adaptive Cross-Layer Error Control Coding for Heterogeneous Application Environments
US20220182111A1 (en) Mimo detector selection
Zhang et al. Semantic edge computing and semantic communications in 6g networks: A unifying survey and research challenges
EP3981094B1 (en) Peak to average power ratio reduction of optical systems utilizing error correction
EP3665879A1 (en) Apparatus and method for detecting mutually interfering information streams
CN116346281A (en) Communication system and method for transmitting and processing data
CN121040066A (en) Distributed video coding using reliability data
US20220294557A1 (en) Error correction in network packets
WO2017194012A1 (en) Polar code processing method and apparatus, and node
CN119853860B (en) Adaptive cross-domain gateway underwater acoustic coding method, device, electronic device and medium
US8289999B1 (en) Permutation mapping for ARQ-processed transmissions
CN114026827A (en) Transmitter algorithm
Mortaheb Deep Learning-Enabled Intelligent Goal-Oriented and Semantic Communication for 6G Networks
Khani Continuous Learning for Lightweight Machine Learning Inference at the Edge
WO2024226282A1 (en) Receiver side prediction of encoding selection data for video encoding
WO2025172978A1 (en) Obtaining learning model parameters at a device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22769351

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22769351

Country of ref document: EP

Kind code of ref document: A1