
WO2010014239A2 - Staggercasting with hierarchical coding information - Google Patents


Info

Publication number
WO2010014239A2
WO2010014239A2 (PCT/US2009/004406)
Authority
WO
WIPO (PCT)
Prior art keywords
stream
data units
encoded data
frames
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2009/004406
Other languages
French (fr)
Other versions
WO2010014239A3 (en)
Inventor
Avinash Sridhar
David Anthony Campana
Zhenyu Wu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Thomson Licensing SAS
Original Assignee
Thomson Licensing SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Licensing SAS filed Critical Thomson Licensing SAS
Publication of WO2010014239A2 publication Critical patent/WO2010014239A2/en
Publication of WO2010014239A3 publication Critical patent/WO2010014239A3/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • H04N21/8451Structuring of content, e.g. decomposing content into time segments using Advanced Video Coding [AVC]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/157Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
    • H04N19/159Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/31Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/577Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/70Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/89Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving methods or arrangements for detection of transmission errors at the decoder
    • H04N19/895Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving methods or arrangements for detection of transmission errors at the decoder in combination with error concealment
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234327Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234381Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the temporal resolution, e.g. decreasing the frame rate by frame skipping
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2383Channel coding or modulation of digital bit-stream, e.g. QPSK modulation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26275Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists for distributing content or additional data in a staggered manner, e.g. repeating movies on different channels in a time-staggered manner in a near video on demand system
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/266Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N21/2662Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/438Interfacing the downstream path of the transmission network originating from a server, e.g. retrieving encoded video stream packets from an IP network
    • H04N21/4382Demodulation or channel decoding, e.g. QPSK demodulation

Definitions

  • The present invention generally relates to data communications systems, and more particularly to the transmission of video data with time diversity.
  • Staggercasting offers a method of protection against signal loss by transmitting a secondary, redundant stream that is time-shifted with respect to a primary stream. This allows a receiver to pre-buffer packets of the secondary stream to replace packets of the primary stream lost in transmission.
  • Various staggercasting techniques exist that differ in the types of redundant data sent in the secondary stream. For example, the secondary stream may simply be an exact copy of the primary stream staggered with some time offset. Such an arrangement, however, can be inefficient as it effectively doubles the bandwidth required by the staggercast transmission.
  • Another staggercasting technique involves the transmission of a secondary stream that is separately encoded from the primary stream.
  • This secondary stream is completely independent of the primary stream and is simply a separately encoded stream representing the same source video.
  • Because video decoders typically must maintain state data, such as previously decoded reference frames needed for decoding future frames, such a staggercasting arrangement requires a receiver to maintain a separate decoder state for each of the streams, placing additional memory burdens on the receiver.
  • Staggercasting and various coding techniques are combined to transmit a secondary coded video stream in addition to a primary coded video stream such that the secondary stream contains a subset of the video frames transmitted in the primary stream.
  • The subset of frames conveyed in the secondary stream is selected in accordance with their relative importance to other frames as determined by the coding technique by which they were encoded. More important frames are thus conveyed in both the primary and secondary streams, whereas less important frames are conveyed only in the primary stream.
  • Staggercasting and hierarchical predictive coding techniques are combined to transmit a secondary coded video stream in addition to a primary coded video stream such that the secondary stream contains a subset of the video frames transmitted in the primary stream.
  • The subset of frames conveyed in the secondary stream is selected in accordance with their relative importance to other frames as determined by the hierarchical predictive coding technique by which they were encoded.
  • Frames used in decoding other frames are transmitted in both the primary and secondary streams, whereas frames not used in decoding other frames are transmitted only in the primary stream.
  • FIG. 1 is a block diagram of an exemplary staggercasting arrangement in which the present invention can be implemented;
  • FIG. 2 shows a hierarchical bipredictive (B) frame structure for temporal scalable video coding;
  • FIG. 3 shows an illustrative scenario in which a B frame sent redundantly in a staggercast stream is used to re-create frames lost in transmission in accordance with an embodiment of the invention.
  • 8-VSB eight-level vestigial sideband
  • QAM Quadrature Amplitude Modulation
  • RF radio-frequency
  • IP Internet Protocol
  • RTP Real-time Transport Protocol
  • RTCP RTP Control Protocol
  • UDP User Datagram Protocol
  • FIG. 1 is a block diagram of an illustrative staggercasting environment 100 comprising a stagger transmitter 15; a communications network 20, which may include a variety of elements (e.g., networking, routing, switching, transport) operating over various media (e.g., wireline, optical, wireless); and a stagger receiver 25.
  • A source, such as a video encoder 10, provides an original stream 12 of encoded data units to the stagger transmitter 15, which, in turn, sends a staggercast transmission over the communications network 20 for reception by the stagger receiver 25.
  • An additional stream 13 may be included by which the encoder 10 communicates coding information to the stagger transmitter 15, as described in greater detail below.
  • The staggercast transmission from the transmitter 15 comprises two streams.
  • The secondary stream 17 can be time-shifted, or staggered, relative to the primary stream 16, in which case it may also be referred to as a "staggered" stream.
  • Corresponding primary and secondary streams 21 and 22, respectively, are received by the stagger receiver 25. Staggering allows the receiver 25 to pre-buffer data units of the secondary stream 22 so that they may replace corresponding data units in the primary stream 21 that may have been lost or corrupted in transmission.
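The replacement mechanism just described can be sketched as follows (a minimal illustration with hypothetical names; the patent does not specify an implementation). Data units are keyed by sequence number, and pre-buffered secondary units stand in for primary units lost in transmission:

```python
def merge_streams(primary_units, secondary_buffer):
    """Yield primary data units, substituting buffered secondary copies for losses.

    primary_units: iterable of (seq_no, payload) pairs, where a payload of
    None marks a unit lost or corrupted in transmission.
    secondary_buffer: dict mapping seq_no -> redundant payload, pre-buffered
    from the staggered secondary stream (a subset of the primary stream).
    """
    for seq_no, payload in primary_units:
        if payload is None and seq_no in secondary_buffer:
            # Replace the lost primary unit with its staggered copy.
            payload = secondary_buffer[seq_no]
        yield seq_no, payload
```

Units lost in the primary stream but not carried in the secondary stream remain missing and must be concealed or re-created by the decoder.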
  • The primary and secondary streams 16, 17 may be combined into a single stream by a multiplexer or the like (not shown) before being provided to the network 20, conveyed as a single stream by the network 20, and de-multiplexed into streams 21 and 22 before being provided to the receiver 25.
  • Alternatively, the primary and secondary streams can be transmitted, conveyed, and received as separate streams.
  • The present invention is not limited to any specific implementation in this regard.
  • The stagger receiver 25 is coupled to a client, such as a video decoder 30, for decoding the received video data.
  • The decoder 30 provides a stream 35 of decoded pictures for display by a display device 40.
  • The contents of the secondary stream 17 output from the stagger transmitter 15 are a subset of the contents of the primary stream 16.
  • This provides a more efficient use of bandwidth over an arrangement in which the secondary stream 17 is a fully redundant stream; i.e., a complete copy of the primary stream 16.
  • The subset of data conveyed in secondary stream 17 is selected in accordance with the coding scheme used to encode the data.
  • One such scheme entails temporal scalable coding as described, for example, in H. Schwarz et al., "Analysis Of Hierarchical B Pictures and MCTF," ICME 2006 (hereinafter "Schwarz et al.”).
  • FIG. 2 shows a hierarchical bipredictive (B) frame structure for temporal scalable video coding, as described in Schwarz et al. In the structure depicted, frames are organized into groups of pictures (GOPs) of eight frames each.
  • The last frame in each GOP, also known as the key frame, can be an intra-coded (I) or a predictive (P) frame.
  • The other seven frames are B frames.
  • The subscript of each frame label indicates the frame's level in the frame structure hierarchy, with lower subscripts indicating greater importance.
  • FIG. 2 also indicates the orders in which the frames are coded and displayed.
  • The order of decoding is the same as the display order.
  • The order of transmission can be the same as either the coding order or the display order.
  • Each GOP has one B1 frame, whose successful decoding depends on the key frame (I0/P0) of that GOP and the key frame of the previous GOP.
  • The aforementioned key frames are reference frames for the B1 frame.
  • The B1 frame is the second frame in the GOP to be coded, after the key frame, and the fourth frame to be decoded and displayed.
  • The key frames can be thought of as the base layer and the B1 frames as the first enhancement layer of a temporally scalable SVC stream.
  • Each GOP also includes two B2 frames, the first of which depends on the B1 frame and the key frame of the previous GOP, and is the third frame to be coded and the second to be decoded and displayed.
  • The second B2 frame depends on the B1 frame and the key frame of the current GOP, and is the sixth frame in the GOP to be coded and the sixth to be decoded and displayed.
  • Each GOP further includes four B3 frames, each of which depends on an adjacent B2 frame and a B1 frame or a key frame, as shown in FIG. 2.
  • Key frames are transmitted in the primary stream 16 as well as in the secondary stream 17 to protect against their loss.
  • B1 frames may also be sent in both the primary and secondary streams 16 and 17.
  • The determination of whether to include a frame in the secondary stream 17 can be based on whether or not it is a reference frame, i.e., a frame on which the decoding of other frames relies.
  • B2 frames are also sent in the secondary stream 17, but B3 frames are not. Doing so provides improved picture quality with only a small increase in required bandwidth, since there are only two B2 frames per GOP.
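For a dyadic GOP such as the eight-frame structure of FIG. 2, the hierarchy level of a frame, and hence the inclusion decision just described, can be computed directly from its display position. The sketch below is illustrative only (the patent does not prescribe an implementation); the levels follow the subscripts above: key frame level 0, B1 level 1, B2 level 2, B3 level 3.

```python
def hierarchy_level(position, gop_size=8):
    """Hierarchy level of the frame at 1-based display position in its GOP.

    Assumes a dyadic GOP (gop_size a power of two) with the key frame last,
    as in FIG. 2: position 8 -> level 0 (key frame), 4 -> 1 (B1),
    2 and 6 -> 2 (B2), odd positions -> 3 (B3).
    """
    if position == gop_size:
        return 0                          # key frame (I or P)
    level = gop_size.bit_length() - 1     # log2(gop_size) for dyadic GOPs
    while position % 2 == 0:              # each factor of two raises importance
        level -= 1
        position //= 2
    return level

def send_in_secondary(position, max_level=1, gop_size=8):
    """Include key and B1 frames (levels 0-1) in the staggered stream;
    raising max_level to 2 also adds the two B2 frames per GOP."""
    return hierarchy_level(position, gop_size) <= max_level
```

With the default threshold only the key and B1 frames are protected; raising it to 2 trades a small amount of bandwidth for the improved picture quality noted above.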
  • The hierarchical coding scheme illustrated in FIG. 2 is only one of a variety of coding schemes that can be used with embodiments of the invention.
  • An encoder may use a coding scheme in which reference frames are generated with greater or lesser frequency than in the scheme depicted. For instance, every other frame in stream 12 can be a reference frame.
  • Reference frames can occur regularly (e.g., every Nth frame) or at varying intervals, and with different patterns.
  • The coding scheme used by the encoder 10 is preferably selected with bandwidth efficiency in mind, allowing the stagger transmitter 15 to select for inclusion in the secondary stream those frames that provide the greatest value in re-creating lost frames relative to the additional bandwidth they require.
  • Bandwidth availability information can be fed back to the encoder, which can change its coding scheme accordingly to optimize bandwidth efficiency.
  • The determination of which frames to include in the secondary stream 17 is made by the stagger transmitter 15.
  • The decision to include a frame in the secondary stream 17 will depend on the characteristics of the frame (e.g., frame type, priority level) and/or the available bandwidth.
  • The stagger transmitter 15 can determine the characteristics of each frame that it receives from the source 10 in a number of different ways.
  • The source 10 can communicate frame characteristics and/or coding scheme information to downstream devices such as the stagger transmitter 15.
  • Such information can be sent in-band, via stream 12 in the form of additional packets or header information added to encoded data units, or out-of-band, via a separate stream 13 in one or more packets.
  • Coding scheme information may include a variety of information about the coding scheme used so as to enable a downstream device such as the stagger transmitter 15 to determine frame characteristics. Such information may include, for example, detailed information about a segment of video data explicitly indicating the type of each frame in the segment, or it may include a few key parameters of the coding scheme used to encode the video segment (e.g., GOP size, frame structure), which devices such as the stagger transmitter 15 can use to infer frame types.
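One possible shape for such out-of-band coding-scheme information is sketched below; the field names and JSON encoding are illustrative assumptions, as the patent leaves the message format open.

```python
import json

# Key parameters of the coding scheme (a hypothetical stream 13 payload) from
# which a downstream device such as the stagger transmitter can infer the type
# of each frame without parsing the video itself.
coding_scheme_info = {
    "codec": "SVC",               # hypothetical identifier
    "gop_size": 8,                # frames per GOP
    "key_frame_position": 8,      # 1-based display position of the key frame
    "structure": "hierarchical-B",
    "temporal_levels": 4,         # levels 0 (key frame) through 3 (B3)
}

message = json.dumps(coding_scheme_info)  # payload for one out-of-band packet
```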
  • The coding scheme information may be sent in the form of a file conveyed as payload by one or more packets, or in packet headers.
  • Alternatively, the stagger transmitter 15 decodes and/or parses the headers of packets in the stream 12, typically organized as Network Abstraction Layer (NAL) units, for information indicative of one or more characteristics of each frame received from the source 10.
  • NAL Network Abstraction Layer
  • NAL units with an NRI value of '00' are not used to reconstruct reference pictures for future prediction; they can therefore be lost or discarded without risking the integrity of the reference pictures in the same layer.
  • An NRI value greater than '00' indicates that decoding of the NAL unit is required to maintain the integrity of reference pictures in the same layer, or that the NAL unit contains parameter sets. If it is determined that a frame is a reference frame (i.e., NRI > '00') and thus should be protected, the stagger transmitter 15 can decide to include the frame in the secondary stream 17, assuming there is available bandwidth to do so.
  • Another field in the SVC NAL unit header that can be used to determine whether a NAL unit should be included in the secondary stream 17 is the six-bit priority id (PRID) field. A lower PRID value indicates a higher priority.
  • The stagger transmitter 15 can select NAL units for inclusion in the secondary stream 17 based on PRID so that, for example, NAL units with a PRID value less than a threshold value will be included in the secondary stream.
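The NRI and PRID checks described above can be sketched as follows. The bit positions follow the H.264/SVC NAL unit header layout (NRI occupies the two bits after the forbidden_zero_bit in the first byte; priority_id occupies the low six bits of the first SVC extension byte, present for NAL unit types 14 and 20); the threshold value is an illustrative assumption.

```python
def nal_ref_idc(header):
    """NRI: two bits after the forbidden_zero_bit in the first header byte.
    0 means the unit is not used to reconstruct reference pictures."""
    return (header[0] >> 5) & 0x03

def svc_priority_id(header):
    """PRID: low six bits of the first SVC extension byte (header[1] here).
    Lower values indicate higher priority."""
    return header[1] & 0x3F

def include_in_secondary(header, prid_threshold=16):
    """Protect reference NAL units whose PRID is below the threshold."""
    return nal_ref_idc(header) > 0 and svc_priority_id(header) < prid_threshold
```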
  • Frame characteristic information can also be conveyed using Quality of Service (QOS) or Type of Service (TOS) information (referred to herein collectively as "type-of-service" information) contained in the stream 12.
  • QOS Quality of Service
  • TOS Type of Service
  • The source 10 sets type-of-service bits in the headers of packets that it forwards to downstream devices such as the stagger transmitter 15.
  • The type-of-service bits of each packet are set in accordance with the frame information contained in the packet.
  • The stagger transmitter 15 parses the type-of-service information in the headers of encoded data units in stream 12 to determine the type of frame (e.g., key frame) being conveyed.
  • The frame characteristics can be determined by the stagger transmitter 15 for all frames communicated from the source 10 or for a subset thereof. For example, if only key frames are to be included in the secondary stream 17, the stagger transmitter 15 need only determine whether a frame in stream 12 is a key frame in deciding whether to include it in the secondary stream. Even if additional frames are to be included in the secondary stream 17, such as B1 frames in the above example, the stagger transmitter can infer the positions of such frames in the stream 12 from the positions of the key frames. This saves the processing overhead that would otherwise be required to parse header information to identify those frames as well.
  • The determination of whether to include frames in a stagger stream can be made by other components in a staggercasting environment as well.
  • For example, a multiplexer in network 20 receiving the primary 16 and secondary 17 streams can identify frames, using one of the above-described techniques, and decide whether to drop frames from or add frames to its output based on frame type and/or bandwidth availability.
  • The determination of which frames to include in the primary and secondary streams may also be made upstream, by the source 10.
  • FIG. 3 shows an illustrative scenario in which a B1 frame sent redundantly in a staggercast stream is used to re-create frames lost in transmission in accordance with an embodiment of the invention.
  • The lost frames include the B1 frame of GOP(N+1), in addition to the two B3 frames and the two B2 frames transmitted before and after the B1 frame.
  • The secondary stream 22 contains copies of the key frame and the B1 frame of each GOP, designated I'/P' and B1' respectively. In this scenario, the secondary stream is received without error.
  • The offset between the two streams 21, 22 is shown as four data units; i.e., the secondary stream 17 is transmitted four data units earlier than the primary stream 16.
  • For simplicity, all frames are shown in FIG. 3 as having the same transmission time.
  • In practice, the size of a coded frame varies substantially from frame to frame, and thus so does the transmission time of each frame.
  • The stagger offset is typically expressed in terms of time rather than frames; e.g., the secondary stream frames may be transmitted four seconds earlier than their primary stream equivalents.
  • The invention is not limited to any specific time offset. The preferred time offset for a given implementation will depend on implementation-specific details such as, for example, the amount of memory available at the receiver for buffering and the error or loss characteristics of the channel.
  • The secondary stream can alternatively be staggered later in time than the primary stream.
  • The secondary stream, however, should preferably precede the primary stream.
  • Transmitting the secondary stream later in time than the primary stream would result in the protection arriving some time after a data loss. Either at initial playback or upon the first loss event, the primary stream would have to pause while waiting for the replacement data units from the stagger stream to arrive, resulting in a diminished viewer experience.
  • When the secondary stream precedes the primary stream, the receiver can immediately begin playback of the primary stream while buffering the secondary stream to protect against future loss.
  • The primary and secondary streams may be provided with error protection (e.g., turbo coding, forward error correction). Error protection may be applied to both streams or to the secondary stream only.
  • The two streams may also be provided with different levels of error protection, with the secondary stream preferably given the higher level. Applying an error protection scheme only to the secondary stream reduces its overhead and also allows the receiver to immediately decode and play the unprotected primary stream. Since the secondary stream is preferably received before the primary stream, there should be sufficient time to correct errors in any secondary stream data units before they may be needed to replace lost primary stream data units.
  • As illustrated in FIG. 3, the lost B1 frame of GOP(N+1) is replaced in the decoder output stream 35 by its copy B1' received in the secondary stream 22. Additionally, the two B2 frames and two B3 frames of GOP(N+1) that were lost in the primary stream 21 are re-created by the decoder 30 using the frame B1' (of the same GOP) received in the secondary stream 22.
  • The re-created frames are designated B2* and B3* in FIG. 3. To the extent that they are relevant to the re-creation of the missing frames, other frames received successfully in the primary or secondary stream are also used in the re-creation. For instance, in the scenario illustrated, the key frames of GOP(N) and GOP(N+1) are used in the re-creation of the B2* frames.
  • the missing B 2 and B 3 frames can be replaced with the B 1 ' frame, or they can be estimated by applying some form of interpolation or the like.
  • the present invention is not limited to any one particular recreation method in this regard.
  • the principles of the present invention can be applied to any coding scheme which includes frames that can be lost and re-created from other frames without unduly compromising video quality.
  • the coding scheme is a hierarchical predictive or P-frame scheme in which each GOP comprises a key frame, one or more P frames and/or B frames.
  • the key and P frames of each GOP are transmitted in both the primary and secondary streams 16, 17.
  • the source 10 provides a single stream 12 which is re-transmitted by the transmitter 15 as part of a staggercast transmission of two streams 16, 17.
  • This is only one of a variety of possible arrangements to which the principles of the present invention can be applied.
  • an arrangement in which the source 10 generates a staggercast transmission (with two streams) which is then received and re-transmitted by one or more staggercast transceivers could also be used with the present invention.
  • a variety of combinations of the source 10, stagger transmitter 15 and other elements such as a multiplexer are contemplated by the present invention.
  • Embodiments of the present invention enjoy several advantages over known approaches.
  • one staggercasting method involves the transmission of a secondary stream that is separately encoded from the primary stream.
  • this secondary stream is completely independent from the primary stream and is simply a separately encoded stream representing the same source video.
  • Typical video decoders must maintain state data, such as previously decoded reference frames that must be available for decoding future frames that are predicted from them.
  • a receiver would need to maintain two separate decoder states for each of the streams, placing additional memory burdens on the receiver.
  • the exemplary arrangement of the present invention described above can be implemented with only one decoder and associated state memory given that the two streams are related; i.e., the secondary stream is a subset of the primary stream.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

In the transmission of streams of data, such as coded video, staggercasting, in which a primary and a secondary stream are transmitted at some relative time offset (i.e., "staggered"), allows a receiver to pre-buffer frames of the secondary stream to replace frames of the primary stream that may have been lost in transmission. In an illustrative implementation, staggercasting is performed in which the secondary stream contains a subset of the coded video frames transmitted in the primary stream. The primary stream contains non-disposable frames, which are essential to properly decoding the video data, as well as disposable frames which are not. The secondary stream, however, contains copies of the non-disposable frames and may contain copies of some of the disposable frames or no disposable frames at all. When frames in the primary stream are lost, such an arrangement will allow reconstruction at the receiver of a high quality video stream using the frames in the secondary stream. The determination of which frames to include in the secondary stream will depend on their importance as determined by the coding scheme with which they were generated.

Description

STAGGERCASTING WITH HIERARCHICAL CODING INFORMATION
Related Patent Applications
[001] This application claims the benefit under 35 U.S.C. § 119(e) of United States Provisional Application No. 61/083,968, filed July 28, 2008, the entire contents and file wrapper of which are hereby incorporated by reference for all purposes into this application.
Field of Invention
[002] The present invention generally relates to data communications systems, and more particularly to the transmission of video data with time diversity.
Background
[003] Many transmission systems, such as mobile wireless broadcast systems, are subject to a difficult physical channel. In addition to fading and Doppler effects, the signal may be obstructed by buildings, trees, poles, and overpasses, among other things. Such conditions can easily cause signal loss for a period of a second or more at a receiver.
[004] To combat these problems, mobile systems frequently use techniques which incorporate some form of time diversity, such as: interleaving; long block codes, such as Low Density Parity Codes (LDPC) or Turbo codes; convolutional codes; and Multi-Protocol Encapsulation combined with Forward Error Correction (MPE-FEC). Unfortunately, these systems generally incur a delay that is proportional to the time diversity. A user typically perceives this delay in the form of long channel change times, which can be highly objectionable to the user.
[005] A type of time diversity technique often used in the transmission of streams of data, such as video data, is staggercasting. Staggercasting offers a method of protection against signal loss by transmitting a secondary, redundant stream that is time-shifted with respect to a primary stream. This allows a receiver to pre-buffer packets of the secondary stream to replace packets of the primary stream lost in transmission.
[006] Various staggercasting techniques exist that differ in the types of redundant data sent in the secondary stream. For example, the secondary stream may simply be an exact copy of the primary stream staggered with some time offset. Such an arrangement, however, can be inefficient as it effectively doubles the bandwidth required by the staggercast transmission.
[007] Another staggercasting technique involves the transmission of a secondary stream that is separately encoded from the primary stream. When scalable video coding is not available (for example, with a specification or standard that does not offer a scalable video codec), this secondary stream is completely independent from the primary stream and is simply a separately encoded stream representing the same source video. Because video decoders typically must maintain state data, such as previously decoded reference frames needed for decoding future frames, such a staggercasting arrangement requires a receiver to maintain a separate decoder state for each of the two streams, placing additional memory burdens on the receiver.
Summary
[008] In an exemplary embodiment of the present invention, staggercasting and various coding techniques are combined to transmit a secondary coded video stream in addition to a primary coded video stream such that the secondary stream contains a subset of the video frames transmitted in the primary stream. The subset of frames conveyed in the secondary stream is selected in accordance with their relative importance to other frames as determined by the coding technique by which they were encoded. More important frames are thus conveyed in the primary and secondary streams, whereas less important frames are conveyed only in the primary stream.
[009] In an exemplary embodiment of the present invention, staggercasting and hierarchical predictive coding techniques are combined to transmit a secondary coded video stream in addition to a primary coded video stream such that the secondary stream contains a subset of the video frames transmitted in the primary stream. The subset of frames conveyed in the secondary stream is selected in accordance with their relative importance to other frames as determined by the hierarchical predictive coding technique by which they were encoded. Frames used in decoding other frames are transmitted in both the primary and secondary streams, whereas frames not used in decoding other frames are transmitted only in the primary stream.
[0010] The impact of losing coded video data is alleviated by transmitting video reference frames in both the primary and secondary streams. Disposable video frames, while transmitted in the primary stream, need not be transmitted in the secondary stream if bandwidth is limited. Staggering the two streams in time reduces the likelihood of the secondary stream data being lost along with the primary stream data. A receiver can buffer the secondary stream so that it may fall back on this data when loss of the primary stream occurs.
[0011] In view of the above, and as will be apparent from reading the detailed description, other embodiments and features are also possible and fall within the principles of the invention.
Brief Description of the Figures
[0012] Some embodiments of apparatus and/or methods in accordance with embodiments of the present invention are now described, by way of example only, and with reference to the accompanying figures in which:
[0013] FIG. 1 is a block diagram of an exemplary staggercasting arrangement in which the present invention can be implemented;
[0014] FIG. 2 shows a hierarchical bipredictive (B) frame structure for temporal scalable video coding; and
[0015] FIG. 3 shows an illustrative scenario in which a B frame sent redundantly in a staggercast stream is used to re-create frames lost in transmission in accordance with an embodiment of the invention.
Description of Embodiments
[0016] Other than the inventive concept, the elements shown in the figures are well known and will not be described in detail. For example, other than the inventive concept, familiarity with television broadcasting, receivers and video encoding is assumed and is not described in detail herein. For example, other than the inventive concept, familiarity with current and proposed recommendations for TV standards such as NTSC (National Television Systems Committee), PAL (Phase Alternation Lines), SECAM (SEquential Couleur Avec Memoire), ATSC (Advanced Television Systems Committee), Chinese Digital Television System (GB) 20600-2006 and DVB-H is assumed. Likewise, other than the inventive concept, familiarity with other transmission concepts such as eight-level vestigial sideband (8-VSB) and Quadrature Amplitude Modulation (QAM), and with receiver components such as a radio-frequency (RF) front-end (such as a low noise block, tuners, down converters, etc.), demodulators, correlators, leak integrators and squarers, is assumed. Further, other than the inventive concept, familiarity with protocols such as Internet Protocol (IP), Real-time Transport Protocol (RTP), RTP Control Protocol (RTCP) and User Datagram Protocol (UDP) is assumed and not described herein. Similarly, other than the inventive concept, familiarity with formatting and encoding methods such as the Moving Picture Expert Group (MPEG)-2 Systems Standard (ISO/IEC 13818-1), H.264 Advanced Video Coding (AVC) and Scalable Video Coding (SVC) is assumed and not described herein. It should also be noted that the inventive concept may be implemented using conventional programming techniques, which, as such, will not be described herein. Finally, like numbers on the figures represent similar elements.
[0017] FIG. 1 is a block diagram of an illustrative staggercasting environment 100 comprising a stagger transmitter 15; a communications network 20, which may include a variety of elements (e.g., networking, routing, switching, transport) operating over various media (e.g., wireline, optical, wireless); and a stagger receiver 25. A source, such as a video encoder 10, provides an original stream 12 of encoded data units to the stagger transmitter 15, which, in turn, sends out a staggercast transmission over the communications network 20 for reception by the stagger receiver 25. An additional stream 13 may be included by which the encoder 10 communicates coding information to the stagger transmitter 15, as described in greater detail below.
[0018] The staggercast transmission from the transmitter 15 comprises two streams. One stream, the primary stream 16, corresponds to the original stream 12 from the source 10 and the other stream, the secondary stream 17, can be a copy of all or a portion of the primary stream. The secondary stream 17 can be time-shifted or staggered relative to the primary stream 16, in which case it may also be referred to as a "staggered" stream. Corresponding primary and secondary streams 21 and 22, respectively, are received by the stagger receiver 25. Staggering allows the receiver 25 to pre-buffer data units of the secondary stream 22 so that they may replace corresponding data units in the primary stream 21 that may have been lost or corrupted in transmission.
[0019] The primary and secondary streams 16, 17 may be combined into a single stream by a multiplexer or the like (not shown) before being provided to the network 20, conveyed as a single stream by the network 20, and de-multiplexed into streams 21 and 22 before being provided to the receiver 25. Alternatively, the primary and secondary streams can be transmitted, conveyed and received as separate streams. The present invention is not limited to any specific implementation in this regard.
[0020] The stagger receiver 25 is coupled to a client, such as a video decoder 30, for decoding the received video data. The decoder 30 provides a stream 35 of decoded pictures for display by a display device 40.
[0021] In accordance with an exemplary embodiment of the invention, the contents of the secondary stream 17 output from the stagger transmitter 15 are a subset of the contents of the primary stream 16. This provides a more efficient use of bandwidth over an arrangement in which the secondary stream 17 is a fully redundant stream; i.e., a complete copy of the primary stream 16. The subset of data that is conveyed in secondary stream 17 is selected in accordance with the coding scheme used to encode the data conveyed. One such scheme entails temporal scalable coding as described, for example, in H. Schwarz et al., "Analysis Of Hierarchical B Pictures and MCTF," ICME 2006 (hereinafter "Schwarz et al.").
[0022] FIG. 2 shows a hierarchical bipredictive (B) frame structure for temporal scalable video coding, as described in Schwarz et al. In the structure depicted, frames are organized into groups of pictures (GOPs), each with eight frames. The last frame in each GOP, also known as the key frame, can be an intra-coded (I) or a predictive (P) frame. The other seven frames are B frames. The subscript of each frame label indicates the frame's level in the frame structure hierarchy, with lower subscripts indicating greater importance. FIG. 2 also indicates the orders in which the frames are coded and displayed. The order of decoding is the same as the display order. The order of transmission can be the same as the coding or the display order. In addition, the arrows shown in FIG. 2 indicate the coding inter-dependency of the various frames. Thus, as shown in FIG. 2, each GOP has one B1 frame, whose successful decoding depends on the key frame (I0/P0) of that GOP and the key frame of the previous GOP. In other words, the aforementioned key frames are reference frames for the B1 frame. The B1 frame is the second frame in the GOP to be coded, after the key frame, and the fourth frame to be decoded and displayed. The key frames can be thought of as the base layer and the B1 frames as the first enhancement layer of a temporally scalable SVC stream.
[0023] Each GOP also includes two B2 frames, the first of which depends on the B1 frame and the key frame of the previous GOP, and is the third frame to be coded and the second to be decoded and displayed. The second B2 frame depends on the B1 frame and the key frame of the current GOP, and is the sixth frame in the GOP to be coded and the sixth frame to be decoded and displayed.
[0024] Finally, at the lowest level in the hierarchy, each GOP includes four B3 frames, each of which is dependent on an adjacent B2 frame and a B1 frame or a key frame, as shown in FIG. 2.
[0025] Because the decoding of each frame in a GOP ultimately depends on key frames, the loss of a key frame would have the greatest impact on the resultant picture quality. As such, it is preferable that key frames are transmitted in the primary stream 16 as well as in the secondary stream 17 to protect against their loss. Similarly, because the decoding of the two B2 frames and the four B3 frames in a GOP depends on the B1 frame in the GOP, it is preferable that B1 frames also be sent in both the primary and secondary streams 16 and 17. If any of the B2 or B3 frames were to be lost, however, a decoder would still be able to decode a GOP with minimal loss in perceived video quality. In an exemplary embodiment, therefore, these frames are sent in the primary stream 16 but not in the secondary stream 17. Thus, the amount of bandwidth that the stagger transmission requires is reduced by not carrying a complete copy of the original stream but rather just a subset with the more important frames, namely, the key and B1 frames.
[0026] In an exemplary embodiment, the determination of whether to include a frame in the secondary stream 17 can be based on whether or not it is a reference frame, i.e., a frame on which the decoding of other frames relies. In such an embodiment, in addition to the key and B1 frames, B2 frames are also sent in the secondary stream 17, but B3 frames are not. Doing so provides improved picture quality with a small increase in required bandwidth since there are only two B2 frames per GOP.
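The dependency structure of FIG. 2 can be captured in a small table. In this sketch, display positions within one GOP are assumed to run 1 through 8 (position 8 being the key frame), with position 0 standing in for the previous GOP's key frame; a frame is disposable exactly when no other frame lists it as a reference:

```python
# Hypothetical model of the 8-frame hierarchical-B GOP of FIG. 2:
# display position -> (hierarchy level, reference positions).
GOP = {
    8: ("key", [0]),     # I0/P0, predicted from the previous key frame
    4: ("B1",  [0, 8]),  # B1 depends on both surrounding key frames
    2: ("B2",  [0, 4]),  # first B2: previous key frame and B1
    6: ("B2",  [4, 8]),  # second B2: B1 and current key frame
    1: ("B3",  [0, 2]),
    3: ("B3",  [2, 4]),
    5: ("B3",  [4, 6]),
    7: ("B3",  [6, 8]),
}

referenced = {r for _, refs in GOP.values() for r in refs}
disposable = [pos for pos in GOP if pos not in referenced]
# Only the four B3 frames are never used as references, so they can
# be lost without propagating errors to any other frame.
print(sorted(disposable))  # -> [1, 3, 5, 7]
```

Walking this table confirms the selection rule in the text: losing a key or B1 frame invalidates most of the GOP, while losing a B3 frame affects only that frame.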
[0027] The hierarchical coding scheme illustrated in FIG. 2 is only one of a variety of different coding schemes that can be used with embodiments of the invention. For example, an encoder may use a coding scheme in which reference frames are generated with greater or lesser frequency than in the scheme depicted. For instance, every other frame in stream 12 can be a reference frame. Depending on the coding scheme used, reference frames can occur regularly (e.g., every Nth frame) or at varying intervals, and with different patterns.
[0028] The coding scheme used by the encoder 10 is preferably selected with bandwidth efficiency in mind so as to allow the stagger transmitter 15 to select those frames for inclusion in the secondary stream which will provide the greatest value in terms of re-creating lost frames in light of the additional bandwidth required to include those frames in the secondary stream. In an exemplary embodiment, bandwidth availability information can be fed back to the encoder, which can accordingly change the coding scheme that it uses in order to optimize bandwidth efficiency.
[0029] In an exemplary embodiment, the determination of which frames to include in the secondary stream 17 is made by the stagger transmitter 15. The decision to include a frame in the secondary stream 17 will depend on the characteristics (e.g., frame type, priority level) of the frame and/or available bandwidth.
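The transmitter's decision rule can be sketched as a simple filter over frame types. The function name and the boolean knob for including reference B2 frames are illustrative assumptions, not part of the described system:

```python
def select_for_secondary(frame_type, include_b2=False):
    # Always protect the key and B1 frames (the base layer and first
    # enhancement layer); optionally protect B2 frames, which serve as
    # references for B3 frames; never carry the disposable B3 frames.
    if frame_type in ("key", "B1"):
        return True
    if frame_type == "B2":
        return include_b2
    return False

gop = ["B3", "B2", "B3", "B1", "B3", "B2", "B3", "key"]
secondary = [f for f in gop if select_for_secondary(f)]
print(secondary)  # -> ['B1', 'key']
```

In a fuller sketch, the `include_b2` flag would be driven by the available-bandwidth measurement mentioned above rather than fixed at call time.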
[0030] The stagger transmitter 15 can determine the characteristics of each frame that it receives from the source 10 in a number of different ways. In various exemplary embodiments, the source 10 communicates frame characteristics and/or coding scheme information to downstream devices such as the stagger transmitter 15. Such information can be sent in-band, via stream 12 in the form of additional packets or header information added to encoded data units, or out-of-band, via a separate stream 13 in one or more packets.
[0031] Coding scheme information may include a variety of information about the coding scheme used so as to enable a downstream device such as the stagger transmitter 15 to determine frame characteristics. Such information may include, for example, detailed information about a segment of video data explicitly indicating the type of each frame in the segment, or it may include a few key parameters of the coding scheme used to encode the video segment (e.g., GOP size, frame structure), which devices such as the stagger transmitter 15 can use to infer frame types. The coding scheme information may be sent in the form of a file conveyed as payload by one or more packets, or in packet headers.
[0032] In an exemplary embodiment, the stagger transmitter 15 decodes and/or parses the headers of packets in the stream 12, typically organized as Network Abstraction Layer (NAL) units, for information indicative of one or more characteristics of each frame received from the source 10. For example, in the four-byte H.264 SVC NAL unit header structure, there is a two-bit nal_ref_idc (NRI) field which indicates whether the content of the NAL unit is used to reconstruct reference pictures for future prediction. NAL units with an NRI value of '00' are not used to reconstruct reference pictures for future prediction, in which case they can be lost or discarded without risking the integrity of the reference pictures in the same layer. An NRI value greater than '00' indicates that the decoding of the NAL unit is required to maintain the integrity of reference pictures in the same layer, or that the NAL unit contains parameter sets. If it is determined that a frame is a reference frame (i.e., NRI > '00'), and thus should be protected, the stagger transmitter 15 can decide to include the frame in the secondary stream 17, assuming there is available bandwidth to do so.
[0033] Another field in the SVC NAL unit header that can be used to determine whether a NAL unit should be included in the secondary stream 17 is the six-bit priority_id (PRID) field. A lower PRID value indicates a higher priority. The stagger transmitter 15 can select NAL units for inclusion in the secondary stream 17 based on PRID so that, for example, NAL units with a PRID value less than a threshold value will be included in the secondary stream.
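A minimal parser for the two fields just described might look as follows. This is a sketch of the bit layout only: the one-byte H.264 NAL header carries forbidden_zero_bit (1 bit), nal_ref_idc (2 bits) and nal_unit_type (5 bits), and for SVC NAL unit types 14 and 20 the first extension byte carries svc_extension_flag (1 bit), idr_flag (1 bit) and priority_id (6 bits). The function names and the PRID threshold are assumptions; real code would also handle start codes and malformed units.

```python
def parse_nal_header(data: bytes):
    # One-byte NAL header: forbidden(1) | nal_ref_idc(2) | nal_unit_type(5).
    nri = (data[0] >> 5) & 0x03
    nal_type = data[0] & 0x1F
    prid = None
    if nal_type in (14, 20) and len(data) > 1:
        # First extension byte: svc_extension_flag(1) | idr_flag(1) | priority_id(6).
        if data[1] & 0x80:        # svc_extension_flag set -> SVC header follows
            prid = data[1] & 0x3F
    return nri, nal_type, prid

def protect(nal, prid_threshold=16):
    nri, _, prid = parse_nal_header(nal)
    # Carry the unit in the secondary stream if it maintains reference
    # pictures (NRI > 0) or carries a high priority (low PRID).
    return nri > 0 or (prid is not None and prid < prid_threshold)

print(parse_nal_header(bytes([0x6E, 0x84])))  # -> (3, 14, 4)
```

In the example byte pair, 0x6E decodes to NRI 3 and nal_unit_type 14, and 0x84 sets the svc_extension_flag with priority_id 4.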
[0034] In another exemplary embodiment, frame characteristic information can be conveyed using Quality of Service (QOS) or Type of Service (TOS) information (referred to herein collectively as "type-of-service" information) contained in the stream 12. In such an embodiment, the source 10 sets type-of-service bits in the headers of packets that it forwards to downstream devices such as the stagger transmitter 15. The type-of-service bits of each packet are set in accordance with the frame information contained in the packet. The stagger transmitter 15, in turn, parses the type-of-service information in the headers of encoded data units in stream 12 to determine the type of frame (e.g., key frame) being conveyed.
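In a UDP-based implementation, the source could mark each outgoing packet with a per-frame-type type-of-service byte using a standard socket option. The mapping below is an illustrative assumption; the text does not mandate specific code points.

```python
import socket

# Hypothetical mapping from frame type to the IP TOS byte; more
# important frames receive higher-priority (DSCP-style) markings.
TOS_BY_FRAME = {"key": 0xB8, "B1": 0x88, "B2": 0x48, "B3": 0x00}

def send_frame(sock, addr, frame_type, payload):
    # IP_TOS sets the type-of-service byte on outgoing IPv4 packets; a
    # downstream stagger transmitter can read it from the IP header to
    # classify the frame without parsing the video payload itself.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_BY_FRAME[frame_type])
    sock.sendto(payload, addr)
```

Setting the option per packet is cheap, and the receiver-side classification reduces to a single header-byte comparison.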
[0035] Note that the frame characteristics, whether conveyed by NAL, type-of-service information, or by any other means, can be determined by the stagger transmitter 15 for all frames communicated from the source 10 or a subset thereof. For example, if only key frames are to be contained in the secondary stream 17, the stagger transmitter 15 need only determine whether a frame in stream 12 is a key frame or not in deciding whether to include the frame in the secondary stream. However, even if additional frames are to be included in the secondary stream 17, such as B1 frames in the above example, the stagger transmitter can infer the positions of such frames in the stream 12 knowing the positions of the key frames. This saves the processing overhead that would otherwise be required to parse header information to identify such frames as well.
[0036] The determination of whether to include frames in a stagger stream can be made by other components in a staggercasting environment as well. For example, a multiplexer in network 20 receiving the primary 16 and secondary 17 streams can identify frames, using one of the above-described techniques, and decide whether to drop or add frames from the secondary stream 17 to the multiplexer output based on frame type and/or bandwidth availability. The determination of which frames to include in the primary and secondary streams may also be done upstream, by the source 10.
[0037] FIG. 3 shows an illustrative scenario in which a B1 frame sent redundantly in a staggercast stream is used to re-create frames lost in transmission in accordance with an embodiment of the invention. As illustrated in FIG. 3, in primary stream 21, as received by stagger receiver 25, five frames in GOP(N+1) are received in error or lost in transmission. The lost frames include the B1 frame of GOP(N+1), in addition to the two B3 frames and the two B2 frames transmitted before and after the B1 frame. The secondary stream 22 contains copies of the key frame and the B1 frame of each GOP, designated I'/P' and B1' respectively. In this scenario, the secondary stream is received without error.
[0038] Note that in the illustrated scenario of FIG. 3, the offset between the two streams 21, 22 is shown as four data units; i.e., the secondary stream 17 is transmitted four data units earlier than the primary stream 16. For simplicity, all frames are shown in FIG. 3 to have the same transmission time. In practice, however, the size of a coded frame will vary substantially from frame to frame, and thus so will the transmission time of each frame. Moreover, the stagger offset is typically expressed in terms of time rather than frames; e.g., the secondary stream frames may be transmitted four seconds earlier than their primary stream equivalents. The invention is not limited to any specific time offset. The preferred time offset for a given implementation will depend on implementation-specific details such as, for example, the amount of memory at the receiver available for buffering and error or loss characteristics. Additionally, the secondary stream can be staggered later in time from the primary stream. For practical reasons, however, the secondary stream should preferably precede the primary stream.
Given that the secondary stream provides protection against loss of the primary stream, transmitting the secondary stream later in time than the primary stream would result in the protection coming some time after a data loss. Either at initial playback or upon the first loss event, the primary stream would have to pause to wait for the replacement data units from the stagger stream to arrive, resulting in a diminished viewer experience. When the secondary stream is offset earlier in time from the primary stream, as shown, the receiver can immediately begin playback of the primary stream while buffering the secondary stream to protect against future loss.
[0039] In an exemplary embodiment, the primary and secondary streams may be provided with error protection (e.g., turbo coding, forward error correction, etc.). Both or only the secondary stream may be provided with error protection. The two streams may also be provided with different levels of error protection, with the secondary stream preferably being provided with a higher level of protection. It would be possible to reduce the overhead of an error protection scheme by applying it only to the secondary stream. This also offers the advantage of allowing the receiver to immediately decode and play the unprotected primary stream. Since the secondary stream is preferably received before the primary stream, there should be sufficient time to correct errors in any secondary stream data units before they may be needed to replace any lost primary stream data units.
[0040] As illustrated in FIG. 3, the lost B1 frame of GOP(N+1) is replaced in the decoder output stream 35 by its copy B1' received in the secondary stream 22. Additionally, the two B2 frames and two B3 frames of GOP(N+1) that were lost in the primary stream 21 are re-created by the decoder 30 using the frame B1' (of the same GOP) received on the secondary stream 22. The re-created frames are designated B2* and B3* in FIG. 3.
To the extent that they would be relevant to the re-creation of the missing frames, other frames received successfully in the primary or secondary stream would also be used in the re-creation. For instance, in the scenario illustrated, the key frames of GOP(N) and GOP(N+1) are used in the re-creation of the B2* frames. A variety of methods are available for re-creating the missing frames. For example, the missing B2 and B3 frames can be replaced with the B1' frame, or they can be estimated by applying some form of interpolation or the like. The present invention is not limited to any one particular re-creation method in this regard.
[0041] Note that while exemplary embodiments have been described with respect to a hierarchical B frame scheme, the principles of the present invention can be applied to any coding scheme which includes frames that can be lost and re-created from other frames without unduly compromising video quality. For example, in a further exemplary embodiment, the coding scheme is a hierarchical predictive or P-frame scheme in which each GOP comprises a key frame and one or more P frames and/or B frames. In an exemplary embodiment, the key and P frames of each GOP are transmitted in both the primary and secondary streams 16, 17.
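The receiver-side fall-back of FIG. 3 can be sketched as follows. The replacement strategy here is the simplest one mentioned in the text — substitute the nearest buffered protected frame — and the data model (frame-index maps, with None marking a lost frame) is an assumption for illustration:

```python
def reconstruct(primary, secondary_buffer):
    # primary: frame index -> payload, or None if lost in transmission.
    # secondary_buffer: pre-buffered copies of the protected frames.
    out = {}
    for idx in sorted(primary):
        frame = primary[idx]
        if frame is None:
            if idx in secondary_buffer:
                frame = secondary_buffer[idx]        # exact copy (e.g., B1')
            else:
                # Re-create a lost B2/B3 frame from the nearest buffered
                # protected frame -- the simple replacement strategy above.
                nearest = min(secondary_buffer, key=lambda k: abs(k - idx))
                frame = secondary_buffer[nearest]
        out[idx] = frame
    return out

# The FIG. 3 scenario: frames 2-6 of the GOP are lost; the secondary
# stream delivered copies of the B1 frame (slot 4) and key frame (slot 8).
primary = {1: "B3", 2: None, 3: None, 4: None, 5: None, 6: None, 7: "B3", 8: "key"}
buffered = {4: "B1'", 8: "I'/P'"}
print(reconstruct(primary, buffered)[4])  # -> B1'
```

A real decoder would interpolate from the surrounding key and B1' frames rather than duplicate one of them, but the control flow — exact copy first, estimated frame otherwise — is the same.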
[0042] Note that in the arrangement shown in FIG. 1, the source 10 provides a single stream 12 which is re-transmitted by the transmitter 15 as part of a staggercast transmission of two streams 16, 17. This, however, is only one of a variety of possible arrangements to which the principles of the present invention can be applied. For example, an arrangement in which the source 10 generates a staggercast transmission (with two streams) which is then received and re-transmitted by one or more staggercast transceivers could also be used with the present invention. A variety of combinations of the source 10, stagger transmitter 15 and other elements such as a multiplexer are contemplated by the present invention.
[0043] Embodiments of the present invention enjoy several advantages over known approaches. As mentioned above, one staggercasting method involves the transmission of a secondary stream that is separately encoded from the primary stream. When scalable video coding is not available (for example, with a specification or standard that does not offer a scalable video codec), this secondary stream is completely independent from the primary stream and is simply a separately encoded stream representing the same source video. Typical video decoders must maintain state data, such as previously decoded reference frames that must be available for decoding future frames that are predicted from them. Where the primary and secondary streams are independent, a receiver would need to maintain a separate decoder state for each of the two streams, placing additional memory burdens on the receiver. The exemplary arrangement of the present invention described above can be implemented with only one decoder and associated state memory given that the two streams are related; i.e., the secondary stream is a subset of the primary stream.
[0044] In view of the above, the foregoing merely illustrates the principles of the invention and it will thus be appreciated that those skilled in the art will be able to devise numerous alternative arrangements which, although not explicitly described herein, embody the principles of the invention and are within its spirit and scope. For example, although illustrated in the context of separate functional elements, these functional elements may be embodied in one, or more, integrated circuits (ICs). Similarly, although shown as separate elements, some or all of the elements may be implemented in a stored-program-controlled processor, e.g., a digital signal processor or a general purpose processor, which executes associated software, e.g., corresponding to one, or more, steps, which software may be embodied in any of a variety of suitable storage media. Further, the principles of the invention are applicable to various types of wired and wireless communications systems, e.g., terrestrial broadcast, satellite, Wireless-Fidelity (Wi-Fi), cellular, etc. Indeed, the inventive concept is also applicable to stationary or mobile receivers. It is therefore to be understood that numerous modifications may be made to the illustrative embodiments and that other arrangements may be devised without departing from the spirit and scope of the present invention.

Claims

1. A method of staggercast transmitting encoded video data comprising:
   transmitting a primary stream of encoded data units; and
   transmitting a secondary stream of encoded data units, wherein the secondary stream of encoded data units is transmitted with a time offset relative to the primary stream of encoded data units and includes a subset of the primary stream of encoded data units selected in accordance with a coding scheme by which the encoded data units are encoded.
2. The method of claim 1, wherein the coding scheme uses hierarchically encoded B frames.
3. The method of claim 2, wherein the secondary stream of encoded data units contains at least one of a key frame and a B frame of a first temporal enhancement layer.
4. The method of claim 1, wherein the coding scheme uses hierarchical coding of P frames.
5. The method of claim 1, wherein the primary stream of encoded data units contains a plurality of frames in which every Nth frame is a reference picture, and wherein the secondary stream of encoded data units contains reference pictures.
6. The method of claim 1 comprising: receiving frame characteristic information relating to frames contained in the encoded data units, wherein the subset of the primary stream of encoded data units included in the secondary stream of encoded data units is selected in accordance with the frame characteristic information.
7. The method of claim 6 comprising: determining the frame characteristic information by decoding or parsing at least one encoded data unit.
8. The method of claim 6, wherein the frame characteristic information is contained in type-of-service data in headers of the encoded data units.
9. The method of claim 6, wherein the frame characteristic information is conveyed in Network Abstraction Layer (NAL) data in headers of the encoded data units.
10. The method of claim 1 comprising: receiving the encoded data units; and receiving information relating to the coding scheme by which the encoded data units are encoded.
11. The method of claim 10, wherein the information relating to the coding scheme and the encoded data units are received in different streams.
12. The method of claim 10, wherein the information relating to the coding scheme is contained in a file.
13. The method of claim 10, wherein the information relating to the coding scheme is contained in one or more network packets.
14. A method of receiving staggercast encoded video data comprising:
   receiving a first portion of a primary stream of encoded data units;
   receiving a secondary stream of encoded data units, wherein the secondary stream of encoded data units is transmitted with a time offset relative to the primary stream of encoded data units and includes a subset of the primary stream of encoded data units selected in accordance with a coding scheme by which the encoded data units are encoded; and
   reconstructing a second portion of the primary stream of encoded data units using the received secondary stream of encoded data units.
15. The method of claim 14, wherein reconstructing the second portion of the primary stream of encoded data units includes using the received first portion of the primary stream of encoded data units.
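Taken together, the transmitting method of claim 1 and the receiving method of claim 14 can be illustrated with a small simulation. The frame labels, the four-slot offset, and the loss burst below are illustrative choices, not part of the claims.

```python
# Minimal end-to-end sketch of staggercasting: the secondary stream is a
# time-offset subset of the primary, so a loss burst that hits the channel
# still leaves the delayed secondary copies of the important frames intact,
# and only the remaining frames need re-creation at the receiver.

PRIMARY   = ["I0", "B1", "B2", "B3", "I1", "B4", "B5", "B6"]
SECONDARY = ["I0", "B1", "I1", "B4"]   # subset: key frames + layer-1 B frames
OFFSET    = 4                          # secondary lags the primary by 4 slots

def staggercast(primary, secondary, offset, lost_slots):
    """Simulate transmission: slot i carries primary[i] and, offset slots
    later, a secondary frame; any slot in lost_slots delivers nothing.
    Returns the set of distinct frames that reached the receiver."""
    slots = {}
    for i, frame in enumerate(primary):
        slots.setdefault(i, []).append(frame)
    for i, frame in enumerate(secondary):
        slots.setdefault(i + offset, []).append(frame)
    received = set()
    for slot, frames in slots.items():
        if slot not in lost_slots:
            received.update(frames)
    return received

# A burst wipes out channel slots 0-2: the primary copies of I0, B1, and B2
# are lost, but the delayed secondary copies of I0 and B1 still arrive, so
# only B2 must be re-created from its neighbors.
got = staggercast(PRIMARY, SECONDARY, OFFSET, lost_slots={0, 1, 2})
print(sorted(f for f in PRIMARY if f not in got))  # ['B2']
```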
PCT/US2009/004406 2008-07-28 2009-07-27 Staggercasting with hierarchical coding information Ceased WO2010014239A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US8396808P 2008-07-28 2008-07-28
US61/083,968 2008-07-28

Publications (2)

Publication Number Publication Date
WO2010014239A2 true WO2010014239A2 (en) 2010-02-04
WO2010014239A3 WO2010014239A3 (en) 2010-03-25

Family

ID=41508439

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/004406 Ceased WO2010014239A2 (en) 2008-07-28 2009-07-27 Staggercasting with hierarchical coding information

Country Status (1)

Country Link
WO (1) WO2010014239A2 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107027052B (en) * 2017-02-28 2019-11-08 青岛富视安智能科技有限公司 The method and system of frame per second adaptively drop in SVC video

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
WO2003009578A2 (en) * 2001-07-19 2003-01-30 Thomson Licensing S.A. Robust reception of digital broadcast transmission

Cited By (3)

Publication number Priority date Publication date Assignee Title
CN106416251A (en) * 2014-03-27 2017-02-15 英特尔Ip公司 Scalable video encoding rate adaptation based on perceived quality
EP3123720A4 (en) * 2014-03-27 2018-03-21 Intel IP Corporation Scalable video encoding rate adaptation based on perceived quality
US11075965B2 (en) 2015-12-21 2021-07-27 Interdigital Ce Patent Holdings, Sas Method and apparatus for detecting packet loss in staggercasting

Also Published As

Publication number Publication date
WO2010014239A3 (en) 2010-03-25


Legal Events

Date Code Title Description
NENP Non-entry into the national phase (Ref country code: DE)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 09789042; Country of ref document: EP; Kind code of ref document: A2)
122 Ep: pct application non-entry in european phase (Ref document number: 09789042; Country of ref document: EP; Kind code of ref document: A2)