
EP4612911A1 - Controlling the sending of at least one picture over a communication network - Google Patents

Controlling the sending of at least one picture over a communication network

Info

Publication number
EP4612911A1
Authority
EP
European Patent Office
Prior art keywords
segments
video
network
picture
adaptation message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP22809840.6A
Other languages
German (de)
English (en)
Inventor
Oskar Drugge
Ying Sun
Martin Pettersson
Balázs Peter GERÖ
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Telefonaktiebolaget LM Ericsson AB
Original Assignee
Telefonaktiebolaget LM Ericsson AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget LM Ericsson AB filed Critical Telefonaktiebolaget LM Ericsson AB
Publication of EP4612911A1
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/647Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless
    • H04N21/64723Monitoring of network processes or resources, e.g. monitoring of network load
    • H04N21/64738Monitoring network characteristics, e.g. bandwidth, congestion level
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/24Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
    • H04N21/2402Monitoring of the downstream path of the transmission network, e.g. bandwidth available
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/63Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/637Control signals issued by the client directed to the server or network components
    • H04N21/6377Control signals issued by the client directed to the server or network components directed to server
    • H04N21/6379Control signals issued by the client directed to the server or network components directed to server directed to encoder, e.g. for requesting a lower encoding rate

Definitions

  • the present disclosure relates to the field of video transfer, and in particular to controlling the sending of at least one picture over a communication network.
  • Digital video is increasing in popularity, from broadcast services, to streaming services and individual video streaming.
  • New services employing high data rates with associated requirements for low latency are expected to be served by mobile networks in the coming years.
  • Services such as cloud-assisted augmented reality (AR) and remote driving may require high rates combined with low lag in the end-user service, which drives the requirement to deliver e.g. a video frame or other rich sensor data within a stipulated delay budget.
  • In a wide area communication network, e.g. comprising a cellular network, users may be located indoors, communicating with a macro tower outdoors, such that the signal needs to be transmitted by a transmitter that has limited output power but still needs to get through walls/windows and reach a receiver that could be several hundred metres away.
  • Before video is transmitted over a communication network, it is compressed into a video bitstream using a video codec. The bitstream is then decompressed on the receiving side before further processing and/or display.
  • Video codecs include H.264/AVC (advanced video coding), H.265/HEVC (high-efficiency video coding) and H.266/VVC (versatile video coding), all developed and standardized jointly by MPEG (Moving Picture Experts Group) and ITU-T (International Telecommunication Union - Telecommunication Standardization Sector).
  • Other codecs with some market deployment include VP8, VP9 and AV1.
  • the standard way to adapt is to reduce the bitrate of the encoded video data transferred over the communication network. While this is certainly effective, the reduced bitrate also significantly decreases the quality (e.g. resolution or other encoding quality parameters) of the video data.
  • One object is to provide a way to adapt video data transfer based on network conditions while keeping the same bitrate.
  • A method for controlling the sending of at least one picture, the method being performed by a video sender connected to a communication network.
  • the method comprises: receiving an adaptation message from a network condition evaluator of the communication network, wherein the adaptation message is based on network conditions on a path of the communication network between the video sender and a video receiver; adapting the number of segments of at least one future picture; and sending the segments of the at least one future picture to the video receiver.
  • the adaptation message may comprise an indication of a maximum segment size, in which case the adapting the number of segments is performed under the condition of complying with the indication of the maximum segment size.
  • the adaptation message may comprise a recommended number of segments per picture, in which case the adapting the number of segments comprises adapting the number of segments to the recommended number of segments per picture.
  • the adaptation message may comprise a recommendation for reducing the number of segments per picture or a recommendation for increasing the number of segments per picture.
  • the video sender may comprise a video encoder, in which case the adapting of the number of segments of the one or more pictures is performed by the video encoder.
  • the sending may comprise sending each one of the segments individually.
  • Each one of the segments may be at least one of a slice, tile, subpicture, decoding unit, gradual decoding refresh area, and video coding layer network abstraction layer unit.
  • the adapting the number of segments may be performed based on the adaptation message.
  • the adapting the number of segments may be performed such that improved network conditions result in fewer segments per picture and deteriorated network conditions result in more segments per picture.
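The sender-side behaviour in the bullets above can be sketched in Python. This is an illustrative sketch only: the `AdaptationMessage` fields and the `adapt_segment_count` helper are hypothetical names covering the message variants mentioned (a recommended number of segments, an increase/decrease recommendation, or a maximum segment size).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AdaptationMessage:
    # Any subset of fields may be present, mirroring the variants above.
    max_segment_size: Optional[int] = None       # bytes
    recommended_segments: Optional[int] = None   # segments per picture
    increase_segments: bool = False
    decrease_segments: bool = False

def adapt_segment_count(current: int, picture_size: int,
                        msg: AdaptationMessage) -> int:
    """Return the number of segments to use for future pictures."""
    n = current
    if msg.recommended_segments is not None:
        n = msg.recommended_segments
    elif msg.increase_segments:
        n = n + 1          # deteriorated conditions: more, smaller segments
    elif msg.decrease_segments:
        n = max(1, n - 1)  # improved conditions: fewer, larger segments
    if msg.max_segment_size is not None:
        # Comply with the maximum segment size: ceil(picture_size / max).
        n = max(n, -(-picture_size // msg.max_segment_size))
    return n
```

The ceiling division ensures that, for a picture of the given size, no segment exceeds the indicated maximum segment size.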
  • a video sender for controlling the sending of at least one picture, the video sender being configured to be connected to a communication network.
  • the video sender comprises: a processor; and a memory storing instructions that, when executed by the processor, cause the video sender to: receive an adaptation message from a network condition evaluator of the communication network, wherein the adaptation message is based on network conditions on a path of the communication network between the video sender and a video receiver; adapt the number of segments of at least one future picture; and send the segments of the at least one future picture to the video receiver.
  • the adaptation message may comprise an indication of a maximum segment size, in which case the instructions to adapt the number of segments comprise instructions that, when executed by the processor, cause the video sender to adapt under the condition of complying with the indication of the maximum segment size.
  • the adaptation message may comprise a recommended number of segments per picture, in which case the instructions to adapt the number of segments comprise instructions that, when executed by the processor, cause the video sender to adapt the number of segments to the recommended number of segments per picture.
  • the adaptation message may comprise a recommendation for reducing the number of segments per picture or a recommendation for increasing the number of segments per picture.
  • the video sender may comprise a video encoder, in which case the adapting of the number of segments of the one or more pictures is performed by the video encoder.
  • the instructions to send may comprise instructions that, when executed by the processor, cause the video sender to send each one of the segments individually.
  • Each one of the segments may be at least one of a slice, tile, subpicture, decoding unit, gradual decoding refresh area, and video coding layer network abstraction layer unit.
  • the instructions to adapt the number of segments may comprise instructions that, when executed by the processor, cause the video sender to adapt the number of segments based on the adaptation message.
  • the instructions to adapt the number of segments may comprise instructions that, when executed by the processor, cause the video sender to adapt the number of segments such that improved network conditions result in fewer segments per picture and deteriorated network conditions result in more segments per picture.
  • a computer program for controlling the sending of at least one picture.
  • the computer program comprises computer program code which, when executed on a video sender causes the video sender to: receive an adaptation message from a network condition evaluator of the communication network, wherein the adaptation message is based on network conditions on a path of the communication network between the video sender and a video receiver; adapt the number of segments of at least one future picture; and send the segments of the at least one future picture to the video receiver.
  • a computer program product comprising a computer program according to the third aspect and a computer readable means comprising non-transitory memory in which the computer program is stored.
  • A method for controlling the sending of at least one picture, the method being performed in a network condition evaluator connected to a communication network.
  • the method comprises: evaluating network conditions of at least part of a path of the communication network between a video sender and a video receiver; generating an adaptation message based on the network conditions, wherein the adaptation message comprises an indication for the video sender to adapt the number of segments in at least one future picture; and sending the adaptation message to the video sender.
  • the generating an adaptation message may comprise determining a recommended number of segments per picture to be sent by the video sender, in which case the adaptation message comprises the recommended number of segments per picture.
  • the generating an adaptation message may comprise reducing the recommended number of segments per picture when network conditions have improved and increasing the recommended number of segments per picture when network conditions have deteriorated.
  • the generating an adaptation message may comprise determining the recommended number of segments based on a proportion of congestion marked packets.
  • a congestion marked packet may be detected using an Explicit Congestion Notification of a header of an internet protocol packet evaluated by the network condition evaluator.
  • the adaptation message may comprise a proportion of congestion marked packets of all packets detected between the video sender and the video receiver.
  • the generating an adaptation message may comprise determining a minimum segment size that can be reliably delivered within a set packet delay budget.
  • the determining a minimum segment size may be made based on at least one of: time-division duplex pattern, signal strength measurements, interference measurements, power headroom measurements, and queue time.
  • the generating an adaptation message may comprise determining a maximum segment size for segments of pictures to be sent by the video sender, in which case the adaptation message comprises an indication of the maximum segment size.
  • the adaptation message may comprise a recommendation for reducing the number of segments per picture.
  • the adaptation message may comprise a recommendation for increasing the number of segments per picture.
  • the evaluating network conditions may be based on metrics of the communication network for at least one of: queue build-up, channel conditions, interference situation, load conditions, power headroom indications, numerology, and Time Division Duplex pattern.
  • the network condition evaluator may be a radio base station, a router or a gateway.
  • the network condition evaluator may be the video receiver.
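One way the evaluator could map congestion statistics to an adaptation message is sketched below, assuming a simple threshold on the fraction of ECN-marked packets. The 5% threshold, the dictionary message format and the `generate_adaptation_message` helper are illustrative assumptions; the disclosure leaves the exact mapping open.

```python
def generate_adaptation_message(marked: int, total: int,
                                current_segments: int,
                                max_segments: int = 16) -> dict:
    """Map the proportion of congestion-marked packets to a
    recommended number of segments per picture."""
    frac = marked / total if total else 0.0
    if frac > 0.05:                       # deteriorated: more segments
        recommended = min(max_segments, current_segments + 1)
    elif frac == 0.0:                     # improved: fewer segments
        recommended = max(1, current_segments - 1)
    else:                                 # in between: keep as-is
        recommended = current_segments
    return {"recommended_segments": recommended,
            "congestion_fraction": frac}
```

The message also carries the marked fraction itself, matching the variant where the adaptation message comprises the proportion of congestion marked packets.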
  • a network condition evaluator for controlling the sending of at least one picture, the network condition evaluator being configured to be connected to a communication network.
  • the network condition evaluator comprises: a processor; and a memory storing instructions that, when executed by the processor, cause the network condition evaluator to: evaluate network conditions of at least part of a path of the communication network between a video sender and a video receiver; generate an adaptation message based on the network conditions, wherein the adaptation message comprises an indication for the video sender to adapt the number of segments in at least one future picture; and send the adaptation message to the video sender.
  • the network condition evaluator may be a radio base station, a router or a gateway.
  • the network condition evaluator may be the video receiver.
  • a computer program for controlling the sending of at least one picture.
  • the computer program comprises computer program code which, when executed on a network condition evaluator being configured to be connected to a communication network, causes the network condition evaluator to: evaluate network conditions of at least part of a path of the communication network between a video sender and a video receiver; generate an adaptation message based on the network conditions, wherein the adaptation message comprises an indication for the video sender to adapt the number of segments in at least one future picture; and send the adaptation message to the video sender.
  • a computer program product comprising a computer program according to the seventh aspect and a computer readable means comprising non-transitory memory in which the computer program is stored.
  • FIG. 1 is a schematic diagram illustrating an environment in which embodiments presented herein can be applied.
  • Figs 2A-B are schematic diagrams illustrating various ways of segmenting a picture;
  • Figs 3A-B are schematic diagrams illustrating embodiments of where the network condition evaluator can be implemented;
  • Fig 4 is a swimlane diagram illustrating embodiments of methods for controlling the sending of at least one picture;
  • Fig 5 is a schematic diagram illustrating components of the video sender and/or the network condition evaluator of Fig 1;
  • Fig 6 is a schematic diagram showing functional modules of the video sender of Fig 1 according to one embodiment;
  • Fig 7 is a schematic diagram showing functional modules of the network condition evaluator of Fig 3A or 3B according to one embodiment; and
  • Fig 8 shows one example of a computer program product comprising computer readable means.
  • FIG. 1 is a schematic diagram illustrating an environment in which embodiments presented herein can be applied.
  • a video sender 2 is an electronic device that is capable of capturing, encoding and sending a bitstream for video data.
  • the video sender 2 can e.g. be implemented using any one or more of a smartphone, a wearable device (such as a head mounted display, HMD), a tablet computer, a laptop computer, a desktop computer or a server.
  • the video sender 2 comprises a video source 20 and an encoder 21.
  • the video source 20 can e.g. be a camera that captures a video stream from its environment.
  • the video stream is made up of a sequence of pictures (also known as frames, used interchangeably herein).
  • the encoder 21 is configurable in terms of segment size and/or the number of segments per picture.
  • the video sender 2 may also comprise a packetizer that is able to packetize the bitstream into segments.
  • the encoder 21 takes the video stream as input and encodes each picture of the video stream into a coded video bitstream. As explained in more detail below, each picture can be split into segments in various ways.
  • the video sender 2 sends the segments of the pictures in encoded video data 15 to a video receiver 5 via a communication network 9.
  • the communication network 9 can comprise a wide area network 6, such as the Internet, and optionally includes communication via Wi-Fi or a cellular network.
  • the communication network 9 can support Internet Protocol (IP) traffic.
  • The cellular communication network can e.g. comply with any one or a combination of sixth generation (6G) mobile networks, fifth generation (5G) mobile networks, LTE (Long Term Evolution), UMTS (Universal Mobile Telecommunications System) utilising W-CDMA (Wideband Code Division Multiple Access), or any other current or future wireless network, as long as the principles described hereinafter are applicable.
  • The communication network 9 can comprise any number of intermediary network nodes 1 on a path through the communication network 9 between the video sender 2 and the video receiver 5.
  • Each one of the network nodes 1 can e.g. be a radio base station (of a cellular network), a router or a gateway.
  • the video receiver 5 receives the encoded video data 15, decodes the encoded video data 15 in a decoder 23 and outputs the decoded video for further processing, e.g. for machine analysis and/or presents the decoded video to a human using a rendering device 24.
  • the video receiver 5 can be implemented in hardware and/or software on an end-user device, such as a mobile phone, a computer or a head mounted display (HMD).
  • the video receiver 5 comprises a video decoder that decodes the video bitstream and may comprise a display to display the decoded video.
  • the video receiver 5 may be located in another node in the network providing a cloud service, e.g., it may be in an edge cloud, or a centrally located server.
  • the video receiver 5 then decodes the segments of the video and processes the video, possibly before encoding it again and sending it to an end-user device for display.
  • Network conditions of at least part of the communication network 9 can be evaluated at either one or both of the network nodes 1 and the video receiver 5. Based on the network conditions, the network nodes 1 and/or the video receiver 5 sends an adaptation message 30 to the video sender 2, causing the video sender to adapt the number of segments per picture.
  • Any node serving a data flow capable of L4S (Low Latency, Low Loss, Scalable throughput) may set the Explicit Congestion Notification (ECN) bits in the IP header for the flow if congestion is experienced.
  • The receiving node (e.g. the video receiver 5) collects the congestion/ECN statistics and feeds this back to the corresponding sending node (e.g. the video sender 2).
  • Common protocols for real time streaming are the Real Time Transport Protocol (RTP) used together with RTCP (Real Time Control Protocol) to convey feedback information.
  • Packets are marked as congested already when queue delays are very low, which gives a prompt reaction to small signs of congestion, allowing the end hosts to implement scalable congestion control where the transmission rate (or congestion window) is changed in proportion to the fraction of congestion-marked packets.
  • L4S enables real-time critical data applications to adapt their rate to the weakest link, providing reduced latency impact due to queue build up.
  • L4S is often triggered by thresholds in the input queue of a network node and may be used to signal a congested situation. Given that most network nodes have a fairly stable or slowly varying output rate, this provides good results. For radio networks, the output rate variations over the wireless link may be more frequent than in traditional wired solutions, which leads to sudden latency peaks even when L4S is used. Nevertheless, there could be more or less advanced schemes to derive the L4S marking. For instance, an expected queue can be determined using predictive methods, which may be able to flag congestion even before it has occurred.
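The scalable reaction described above — changing the rate in proportion to the fraction of congestion-marked packets — can be illustrated with a DCTCP-style update rule. The halving gain and the additive increase step are hypothetical constants, not taken from the disclosure.

```python
def scalable_rate_update(rate_bps: float, marked: int, total: int,
                         alpha: float = 1.0,
                         increase_bps: float = 50_000.0) -> float:
    """Scalable congestion control sketch: cut the sending rate in
    proportion to the fraction of ECN-marked packets over the last
    feedback interval, otherwise increase additively."""
    frac = marked / total if total else 0.0
    if frac > 0.0:
        # Proportional backoff, e.g. a 50% marking fraction halves
        # half the rate (rate * frac / 2 is removed).
        return rate_bps * (1.0 - alpha * frac / 2.0)
    return rate_bps + increase_bps
```

Because the backoff scales with the marking fraction, small signs of congestion cause only small rate reductions, which is what keeps queue delays low under L4S.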
  • The access network can provide access network bitrate recommendations (ANBR) to MTSI (Multimedia Telephony Service for IP Multimedia Subsystem) clients in a terminal.
  • Some access networks may provide the MTSI client in terminal with ANBR messages, separately per local access bearer and separately for the local uplink and downlink. It is expected that an ANBR message is sent to the MTSI client in terminal whenever the access network finds it reasonable to inform about a change in the recommended bitrate, such that the MTSI client in terminal is generally provided with up-to-date recommended bitrate information.
  • a single access bearer can carry multiple RTP streams, in which case ANBR applies to the sum of the individual RTP stream bitrates on that bearer.
  • Access networks supporting ANBR may also support a corresponding ANBR Query (ANBRQ) message, which allows the MTSI client in terminal to query the network for updated ANBR information.
  • ANBRQ shall only be used to query for an ANBR update when media bitrate is to be increased, not for media bitrate decrease.
  • the recommended bitrate procedure is used to provide the MAC (Medium Access Control) entity with information about the bitrate recommended by the radio base station.
  • the recommended bitrate indicates the bitrate that the radio condition can support at the physical layer.
  • An averaging window of default value 2000 ms will apply as specified in 3GPP TS 26.114.
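The 2000 ms averaging window can be implemented as a sliding window over delivered bits. The class below is an illustrative sketch, not an implementation taken from 3GPP TS 26.114.

```python
from collections import deque

class BitrateAverager:
    """Average delivered bits over a sliding window
    (default 2000 ms, the default value in 3GPP TS 26.114)."""
    def __init__(self, window_ms: int = 2000):
        self.window_ms = window_ms
        self.samples = deque()  # (timestamp_ms, bits) pairs

    def add(self, timestamp_ms: int, bits: int) -> None:
        self.samples.append((timestamp_ms, bits))
        # Drop samples that have fallen out of the window.
        while self.samples and self.samples[0][0] <= timestamp_ms - self.window_ms:
            self.samples.popleft()

    def bitrate_bps(self) -> float:
        total_bits = sum(b for _, b in self.samples)
        return total_bits * 1000.0 / self.window_ms
```

A sender comparing its rate against an ANBR value would feed each delivered burst into `add` and read `bitrate_bps` before deciding whether an adjustment is needed.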
  • FIGs 2A-B are schematic diagrams illustrating various ways of segmenting a picture 10.
  • a video codec often supports both intra-coded and inter-coded pictures.
  • An intra-coded picture may only predict from samples of the same picture, whereas inter-coded pictures may also predict from previously decoded pictures, referred to as reference pictures.
  • Inter-coded pictures may be divided into P-pictures, which may only predict from one reference picture at a time for each coding block and bidirectional B-pictures, which may predict from up to two reference pictures simultaneously for each coding block.
  • In order to tune into a bitstream or recover from lost or corrupted frames, the video bitstream often includes periodic intra-coded pictures, often referred to as key frames in video coding.
  • In AVC, HEVC and VVC, such pictures are called intra random access point (IRAP) pictures.
  • AVC, HEVC and VVC also support tuning into the bitstream at an inter-coded picture using gradual decoding refresh (GDR), where each picture from the tune-in point refreshes a new part of the picture, by coding that area with intra-coded blocks, referred to here as the GDR clean area, until the whole picture has been refreshed. Areas that have not yet been refreshed are referred to as dirty areas.
  • GDR is normatively supported in VVC using the GDR picture type, and optionally supported in AVC and HEVC using the recovery point supplemental enhancement information (SEI) message.
  • The pictures in the video bitstream from an IRAP/GDR picture to the next IRAP/GDR picture are referred to as a coded video sequence.
  • Block-based video codecs such as AVC, HEVC and VVC often divide the picture into largest coding units (LCUs), which may be further subdivided into smaller coding units (CUs) before applying prediction and transform coding.
  • The largest coding unit in AVC is called macroblock (MB) and comprises 16x16 pixels; in HEVC and VVC it is called coding tree unit (CTU) and can comprise up to 64x64 and up to 128x128 pixels, respectively.
  • Many video codecs, including AVC, HEVC and VVC support partitioning of a picture into independently coded segments.
  • a picture 10 is shown split into a number of segments 11a-e in the form of picture slices.
  • AVC, HEVC and VVC support segmentation of a picture in raster scan slices, where a picture is divided into slices based on the raster scan order of the MBs for AVC, the CTUs for HEVC and tiles (see below) for VVC.
  • Each slice includes a slice header and the slices are packed into video coding layer (VCL) network abstraction layer (NAL) units.
  • Raster scan slices are useful for dividing a picture into segments of a somewhat similar number of bits. This is for instance useful when the bit size of the picture is larger than the maximum transmission unit (MTU). For Ethernet, the standard MTU size is 1500 bytes. Slices may also be useful for error robustness reasons: it is often easier to perform error concealment on a lost slice than on a lost picture. AV1 also uses a concept similar to slices, where each part of the picture is referred to as a segment.
  • a picture 10 is shown split into a number of segments 11a-l in the form of tiles.
  • HEVC and VVC support tile segmentation, where a picture is divided into an MxN grid of tiles, where each tile includes one or more whole CTUs.
  • A slice may contain multiple tiles, or a tile may be divided into one or more slices: raster scan slices for HEVC, and so-called rectangular slices for VVC. In contrast to slices, tiles do not have a header.
  • Fig 2B below shows an example of a picture 10 divided into twelve tiles. Dividing a picture into equally spatially sized segments, which may be easily done with tiles, is useful for parallel processing in the encoding and/or decoding of the video.
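The even division of a picture into an MxN grid of whole-CTU tiles can be sketched as follows, assuming a VVC-style 128x128 CTU. The `tile_grid` helper and the way the remainder CTUs are distributed are illustrative assumptions, not the codec-specified algorithm.

```python
import math

def tile_grid(pic_width: int, pic_height: int,
              cols: int, rows: int, ctu: int = 128):
    """Split a picture into a cols x rows grid of tiles, each tile
    spanning whole CTUs. Returns per-column widths and per-row
    heights, both expressed in CTUs."""
    ctus_w = math.ceil(pic_width / ctu)   # CTU columns in the picture
    ctus_h = math.ceil(pic_height / ctu)  # CTU rows in the picture
    # Distribute any remainder CTUs over the first columns/rows.
    col_widths = [ctus_w // cols + (1 if i < ctus_w % cols else 0)
                  for i in range(cols)]
    row_heights = [ctus_h // rows + (1 if i < ctus_h % rows else 0)
                   for i in range(rows)]
    return col_widths, row_heights
```

For a 1920x1080 picture split into a 4x3 grid (twelve tiles, as in the Fig 2B example), this yields CTU column widths of 4, 4, 4 and 3 and row heights of 3 each.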
  • VVC also supports a concept called subpictures.
  • a subpicture contains one or more slices that collectively cover a rectangular region of a picture.
  • the subpicture partitioning is specified on a sequence level, i.e., it is valid for all pictures from one IRAP picture to another, whereas the slice partitioning and tile partitioning may be changed for each picture.
  • the subpicture layout may not change between pictures in a sequence and does not have any temporal dependencies with the other subpictures in a picture making it possible to extract subpicture streams from a video bitstream and even merge subpicture streams with subpicture streams from another video bitstream.
  • Unless explicitly stated otherwise, the term segment is used herein to describe any type of picture partitioning including, but not limited to, slices, tiles, subpictures, GDR areas, VCL NAL units and segments as used in AV1.
  • Dividing a picture into independently decodable segments may be beneficial in various ways as described above.
  • However, when a picture is divided into more segments, the required bitrate to achieve the same quality increases, partly due to bitrate overhead from the additional segment headers, but also due to worse prediction, as there are fewer samples to predict from. This effect becomes worse the smaller the segments are, especially for fast-moving content.
  • The throughput can be increased by increasing the output power so as to improve the received SINR (signal to interference and noise ratio), but this can only be done up to the point where the transmitter is transmitting at its maximum output power. Beyond this point, to support a higher rate while maintaining latency performance, the SINR can be marginally improved by means such as using larger antenna arrays with more directivity, reducing interference by interference coordination schemes, or reducing the noise figure of the receiver.
  • For such services, the required rate of transmission may differ significantly from that of a service with (at least piecewise) continuously streaming data and no, or very relaxed, latency requirements.
  • Frames may be generated with an interarrival time corresponding to the frame rate of the application and the interarrival time can be larger than the required delay budget.
  • the delay budget will need to cover the following parts for efficient communication:
  • Data indication delay: The delay from when data arrives in the buffer at the gNB/UE (user equipment) side until the scheduler is made aware of it.
  • Queuing delay: The delay that may occur because the scheduler is prioritizing other users/services even though it is aware of the data in the buffer.
  • Transmission delay: The delay to transfer the data burst, which may need to be serialized into multiple sections, each transmitted in one radio slot. Depending on the scheduler approach, there may be queuing delay for each individual section/slot (e.g. round-robin scheduling).
  • Margin for retransmission of last packets: The delay towards the end where, even though all segments have had their first transmission, there may still be some that failed and are in the process of being retransmitted using HARQ (hybrid automatic repeat request).
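As a rough numerical illustration (not part of the application; every number below is a hypothetical assumption), the four delay-budget parts listed above could be summed as follows:

```python
import math

# Hypothetical decomposition of a delay budget into the four parts above.
# All numeric values are illustrative assumptions, not values from this text.
def total_delay_ms(burst_bits, tbs_bits, slot_ms,
                   indication_ms, queuing_ms, harq_margin_ms):
    """Sum the delay components for transferring one data burst."""
    slots = math.ceil(burst_bits / tbs_bits)   # serialization into radio slots
    transmission_ms = slots * slot_ms          # transmission delay
    return indication_ms + queuing_ms + transmission_ms + harq_margin_ms

# Example: a 1 Mbit frame over 200 kbit transport blocks in 0.5 ms slots,
# with 1 ms data indication delay, 2 ms queuing and a 4 ms HARQ margin.
print(total_delay_ms(1_000_000, 200_000, 0.5, 1.0, 2.0, 4.0))  # -> 9.5
```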
  • the size of the frame being sent, i.e. the minimum volume of data that needs to be transmitted within the latency budget, will greatly affect the coverage for an XR service operating over a wireless network.
  • the frame size of a service is tightly connected to the operation of the respective service at the application level.
  • an AR service may perform multiple functions that are all needed for the service, e.g. object detection, object tracking, mapping, localization and more. Each of these functions may be performed either internally, in a head mounted display (HMD), or with part of the processing being done in an edge computing cloud server.
  • the sensors located in the AR HMD would collect data (e.g. video frames, depth sensor frames) and send the sensor input to the edge cloud for further processing.
  • the edge cloud processing can now be done in different ways, assuming that more or less of the full snapshot of sensor data is processed in one go.
  • a picture of the full view would be collected at time t0 at the HMD. This picture could, according to a first alternative, be transmitted in full across the mobile network to an edge cloud, and the processing of object detection using the picture could start once the full picture is available.
  • the picture could instead be partitioned into multiple segments, where each of the segments would be sent sequentially to the edge cloud, and the processing of object detection could be done using a segment-by-segment approach.
  • the second alternative may have drawbacks, as the processing would have less information to work with.
  • For object detection done sequentially on multiple segments of a picture, the algorithm may fail to identify an object that happens to lie just on the border between two segments.
  • the segment-by-segment processing could greatly reduce the size of the picture frame that the mobile network would need to deliver within a stipulated delay budget, thereby greatly increasing the coverage given a fixed network deployment.
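A back-of-the-envelope sketch (with purely hypothetical numbers) of why segment-by-segment delivery helps: the data volume that must arrive within the delay budget shrinks roughly in proportion to the number of segments, minus a small per-segment header overhead:

```python
# Illustrative only: splitting a picture into n segments divides the data
# volume that must be delivered within the delay budget, at the cost of a
# small per-segment header overhead (all numbers are hypothetical).
def bits_per_delivery(picture_bits, n_segments, header_bits=0):
    return picture_bits / n_segments + header_bits

whole_picture = bits_per_delivery(2_000_000, 1)                 # full picture
one_of_eight = bits_per_delivery(2_000_000, 8, header_bits=800) # one segment
print(whole_picture, one_of_eight)  # -> 2000000.0 250800.0
```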
  • Assessments on e.g. queue build-up, channel conditions, interference situation, load conditions, capability of the used numerology, TDD (time-division duplex) pattern and feature set could be done by a network node within the RAN. Based on these assessments, the network node would signal to the source encoder (video sender) that it needs a more granular (more segments per picture) representation of the source sensor data (e.g. picture), as the current granularity of representation cannot be supported given the current deployment, configuration and conditions.
  • Signalling could either be through an API (application programming interface) exposed at application layer, or use signalling similar or identical to L4S described above, which involves toggling a bit in a communication frame header with increasing or decreasing frequency to indicate the need for a less or more granular source sensor representation.
  • Signalling could be differential in the sense that it could represent code-words for increasing the granularity (more segments per picture) or decreasing the granularity (fewer segments per picture) of video encoding. Signalling from the application to the network about the minimum segment size or maximum segment frequency can assist the state-switching decision in the network between adapting segment size and frequency at the same target data rate, or adapting the application data rate. When the minimum segment size or maximum segment frequency has not been reached, the access network could toggle towards increasing the frequency of the segments; but when the minimum segment size or maximum segment frequency has been reached and there is queued data in the buffer, the network could trigger the recommendation and signal to the application to reduce the application data rate via L4S ECN toggling or an ANBRQ signal.
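The access-network decision described above can be sketched as a small rule set; this is a hypothetical illustration, not the claimed implementation, and the state names are invented for readability:

```python
# Hypothetical sketch of the two-stage decision: first adapt segment
# granularity, and only once the maximum segment frequency (minimum
# segment size) is reached, recommend a lower application data rate.
def network_decision(congested, at_max_frequency, queued_data):
    if congested and not at_max_frequency:
        return "increase-segments-per-picture"
    if congested and at_max_frequency and queued_data:
        return "reduce-application-data-rate"   # e.g. via L4S ECN / ANBRQ
    return "no-change"

print(network_decision(True, False, True))   # granularity adapted first
print(network_decision(True, True, True))    # then fall back to rate adaptation
```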
  • Embodiments presented herein include a system in which network conditions are evaluated, e.g. based on queue build-up, channel conditions, interference situation, load conditions, optionally combined with RAN internal information on configuration such as used numerology, TDD pattern and feature set. This evaluation is used to affect the number of segments per picture at the video sender.
  • the embodiments presented herein comprise signalling information from the network condition evaluator 3 to the video sender 2.
  • the video sender 2 uses this recommendation to increase, decrease or set the granularity in which the video sender 2 breaks the representation of one snapshot of sensor data (e.g. a picture) into smaller, individually consumable data units (segments).
  • the video sender 2 increases or decreases the number of segments used to encode a video frame and the corresponding video stream parser and decoder in the video receiver 5 starts decoding the received slice as soon as the slice arrives.
  • the change of processing granularity may trigger further configuration changes in other components of the application pipeline, e.g. in a SLAM (simultaneous localisation and mapping) or in object detection components.
  • Figs 3A-B are schematic diagrams illustrating embodiments of where the network condition evaluator 3 can be implemented.
  • In Fig 3A, the network condition evaluator 3 is shown as implemented in the network node 1.
  • the network node 1 is thus the host device for the network condition evaluator 3 in this implementation.
  • In Fig 3B, the network condition evaluator 3 is shown as implemented in the video receiver 5.
  • the video receiver 5 is thus the host device for the network condition evaluator 3 in this implementation.
  • Fig 4 is a swimlane diagram illustrating embodiments of methods for controlling the sending of at least one picture.
  • the swimlane diagram can be considered to comprise a flow chart for methods in the video sender 2 in the left swimlane, and a flow chart for methods in the network condition evaluator 3 in the centre swimlane. Communication between the video sender 2, the network condition evaluator 3 and the video receiver 5 is also shown.
  • the video sender 2, network condition evaluator 3, and video receiver 5 communicate via a communication network 9, such as the communication network 9 of Fig 1.
  • the network condition evaluator 3 can be a node separate from the video receiver 5.
  • the network condition evaluator 3 can be a network node, e.g. a radio base station, a router or a gateway.
  • the network condition evaluator 3 is the video receiver 5, i.e. the network condition evaluator 3 is implemented within the video receiver 5.
  • the network condition evaluator 3 evaluates network conditions of at least part of a path of the communication network 9 between a video sender 2 and a video receiver 5.
  • the evaluating of network conditions can be based on metrics of the communication network 9 for at least one of: queue build-up, channel conditions, interference situation, load conditions, power headroom indications, numerology, and TDD pattern.
  • In a generate adaptation message step 52, the network condition evaluator 3 generates an adaptation message based on the network conditions.
  • the adaptation message comprises an indication for the video sender 2 to adapt the number of segments in at least one future picture.
  • the generating the adaptation message comprises determining a recommended number of segments per picture to be sent by the video sender, in which case the adaptation message comprises the recommended number of segments per picture.
  • the generating the adaptation message comprises determining a recommendation of whether to increase or decrease the number of segments per picture to be sent by the video sender, in which case the adaptation message comprises an indication to increase or decrease the number of segments per picture.
  • the video receiver 5 may comprise a service that is highly reliant on low-latency video, such as augmented reality (AR) or other XR-based services, cloud gaming, or remote control of e.g. drones and industry robots.
  • the adaptation message may indicate a recommended number of segments, or a recommended segment size for the segments in one or more pictures, where the segment size may be measured in bits or as a spatial resolution, in terms of e.g. pixels or coding blocks.
  • the information may also indicate a recommendation to increase/decrease the number of segments per picture or the segment size.
  • the recommended number of segments may indicate a maximum or minimum number of recommended segments.
  • the recommended segment size may indicate a maximum or minimum recommended segment size.
  • a segment may be one of a slice, tile, subpicture, decoding unit, GDR area, and VCL NAL unit.
  • the information in the adaptation message may be deduced from network condition metrics that are measured and tracked, e.g. queue build-up, channel conditions, interference situation and load conditions, and/or from RAN internal information on configuration such as used numerology, TDD pattern and feature set.
  • Information may also be received describing a proportion of congestion marked packets, indicating how to adapt the segment size, e.g. increasing or decreasing the segment size.
  • the proportion of congestion marked packets may be determined at the receiver end and fed back to the video sending entity.
  • a congestion marked packet may e.g. be indicated using the ECN bits as part of a (IPv6 or IPv4) header of an IP packet. Any network node between the sender and receiver may mark packets as congested as a response to buffer build up at that node.
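For illustration, the ECN field occupies the two least-significant bits of the IPv4 TOS / IPv6 Traffic Class byte, and the codepoint 0b11 (CE, Congestion Experienced) marks a packet as congested. A hypothetical helper for computing the proportion of congestion-marked packets could look like this:

```python
# The ECN field is the two least-significant bits of the IPv4 TOS /
# IPv6 Traffic Class byte; 0b11 (CE) means Congestion Experienced.
def is_ce_marked(traffic_class_byte):
    return traffic_class_byte & 0b11 == 0b11

def congestion_proportion(traffic_class_bytes):
    """Fraction of packets whose ECN field is set to CE."""
    marked = sum(1 for b in traffic_class_bytes if is_ce_marked(b))
    return marked / len(traffic_class_bytes)

# Two of four packets carry the CE mark.
print(congestion_proportion([0b10, 0b11, 0b01, 0b11]))  # -> 0.5
```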
  • Signalling from the video receiver 5 to the network condition evaluator 3 about the minimum segment size or maximum segment frequency may be used to assist the state-switching decision in the network between adapting segment size and frequency at the same target data rate, or adapting the application data rate.
  • When the minimum segment size or maximum segment frequency has not been reached, the network condition evaluator 3 could toggle towards increasing the number of segments per picture; but when the minimum segment size or maximum segment frequency has been reached, and there is queued data in the buffer, the network condition evaluator 3 could trigger the recommendation and signal to the application to reduce the application data rate via L4S ECN toggling or an ANBRQ signal.
  • the generating the adaptation message comprises reducing the recommended number of segments per picture when network conditions have improved and increasing the recommended number of segments per picture when network conditions have deteriorated.
  • the generating an adaptation message comprises determining the recommended number of segments based on a proportion of congestion marked packets.
  • the congestion marked packet can e.g. be detected using an Explicit Congestion Notification of a header of an IP packet evaluated by the network condition evaluator 3.
  • the adaptation message comprises a proportion of congestion marked packets of all packets detected between the video sender and the video receiver, enabling the video sender to adapt its segmentation based on this proportion.
  • the adaptation message comprises a minimum segment size that can be reliably delivered within a set packet delay budget.
  • the minimum segment size can be determined based on at least one of: time-division duplex pattern, signal strength measurements, interference measurements, power headroom measurements, and queue time.
  • the adaptation message comprises an indication of the maximum segment size, i.e. the maximum size allowed for segments of pictures to be sent by the video sender.
  • the adaptation message comprises a recommendation for reducing the number of segments per picture or for increasing the number of segments per picture.
  • In a send adaptation message step 54, the network condition evaluator 3 sends the adaptation message 30 to the video sender 2.
  • the adaptation message 30 can be signalled through an API exposed at application layer, or signalled in a way similar to L4S, which involves toggling a bit in a communication frame header with increasing or decreasing frequency to indicate the need for a less or more granular source sensor representation.
  • In a receive adaptation message step 40, the video sender 2 receives the adaptation message 30 from the network condition evaluator 3 of the communication network 9.
  • the adaptation message 30 is based on network conditions on a path of the communication network 9 between the video sender 2 and a video receiver 5.
  • the adaptation message 30 can comprise an indication of a maximum segment size.
  • the adaptation message 30 can comprise a recommended number of segments per picture.
  • the adaptation message 30 can comprise a recommendation for reducing the number of segments per picture or a recommendation for increasing the number of segments per picture.
  • In an adapt step 42, the video sender 2 adapts the number of segments of at least one future picture.
  • the adapting the number of segments is performed based on the adaptation message.
  • When the adaptation message comprises the maximum segment size, the adapting of the number of segments is performed under the condition of complying with the indication of the maximum segment size.
  • the adapting the number of segments comprises adapting the number of segments to the recommended number of segments per picture.
  • the video sender 2 can comprise a video encoder 21, in which case the adapting the number of segments of the one or more pictures, can be performed by the video encoder.
  • a first value in the adaptation message triggers the encoder to encode one segment per picture and a second value in the adaptation message triggers the encoder to encode multiple segments per picture.
  • the number of multiple segments is a predetermined fixed number.
  • Each one of the segments can be at least one of a slice, tile, subpicture, decoding unit, gradual decoding refresh area, and video coding layer network abstraction layer unit, as exemplified above.
  • the adapting the number of segments can be performed such that improved network conditions result in fewer segments per picture 10 and deteriorated network conditions result in more segments per picture 10.
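A minimal sender-side sketch of the adaptation rules above, assuming the adaptation message carries a maximum segment size and optionally a recommended segment count (the function and parameter names are invented for illustration):

```python
import math

# Hypothetical sender-side rule: always comply with the maximum segment
# size from the adaptation message; follow the recommendation if it asks
# for even more segments than compliance alone would require.
def segments_per_picture(picture_bits, max_segment_bits, recommended=None):
    minimum = math.ceil(picture_bits / max_segment_bits)  # hard constraint
    if recommended is None:
        return minimum
    return max(minimum, recommended)

print(segments_per_picture(2_000_000, 300_000))       # -> 7
print(segments_per_picture(2_000_000, 300_000, 10))   # -> 10
```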
  • In a send segments step 44, the video sender 2 sends the segments 32 of the at least one future picture to the video receiver 5, suitably packetized and provided in a video bitstream.
  • the sending can comprise sending each one of the segments individually.
  • In a receive segments step 58, the video receiver 5 receives the video bitstream from the video sender 2 (via the communication network 9) and extracts the segments. This allows the video receiver 5 to reconstruct the video picture(s) and render the video for a user (or machine) to see.
  • the video receiver 5 decodes the coded segments in the received video bitstream one-by-one and forwards the decoded segments one-by-one for further processing to a processing entity, e.g. a SLAM or object detection algorithm, which processes the picture before receiving all decoded segments of the picture.
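Sketched in Python, the one-by-one decode-and-forward pipeline at the receiver looks roughly as follows; `decode` and `process` are hypothetical placeholders standing in for a real decoder and e.g. an object detection stage:

```python
# Each decoded segment is forwarded to the processing stage as soon as it
# is decoded, i.e. before all segments of the picture have been received.
def receive_pipeline(coded_segments, decode, process):
    results = []
    for segment in coded_segments:         # segments arrive one-by-one
        decoded = decode(segment)          # decode this segment only
        results.append(process(decoded))   # process before picture is complete
    return results

# Toy stand-ins: "decoding" upper-cases, "processing" measures length.
print(receive_pipeline(["s0", "s1", "s2"], str.upper, len))  # -> [2, 2, 2]
```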
  • the usage coverage for time-critical high bitrate services is extended, allowing the video bitstream to be provided within a delay budget without reducing video encoding bitrate.
  • the video sender 2 can adjust segmentation dynamically. As long as the RAN has enough capacity to offer a big TBS (transport block size) that can often carry a whole picture, the video sender 2 can avoid segmenting the picture. In this case, the video sender 2 can keep the data bandwidth usage optimal, because without segmentation the video compression efficiency is at its best due to low overhead. In addition, avoiding segmentation may have benefits in terms of power saving, since larger chunks transmitted less frequently make it more likely that a DRX (discontinuous reception) pattern (which enables the radio base station and/or UE to go into micro-sleep mode) can be applied.
  • segmentation provides a means to keep the latency requirements in the RAN, at the expense of slightly increased data bandwidth requirements caused by a less efficient compression method and/or increased power consumption at the device/radio base station.
  • embodiments presented herein allow the video sender 2 to use the RAN also in situations with worse network conditions.
  • congestion control solutions in the art respond with gradual quality degradation to deteriorating network conditions.
  • the embodiments presented herein allow a complementary way to respond to deteriorating network conditions without quality degradations, making this method preferable as a first response to deteriorating network conditions.
  • Fig 5 is a schematic diagram illustrating components of the video sender 2 and/or the network condition evaluator 3 of Fig 1. It is to be noted that when the network condition evaluator 3 is implemented in a host device, one or more of the mentioned components can be shared with the host device.
  • a processor 60 is provided using any combination of one or more of a suitable central processing unit (CPU), graphics processing unit (GPU), multiprocessor, neural processing unit (NPU), microcontroller, digital signal processor (DSP), etc., capable of executing software instructions 67 stored in a memory 64, which can thus be a computer program product.
  • the processor 60 could alternatively be implemented using an application specific integrated circuit (ASIC), field programmable gate array (FPGA), etc.
  • the processor 60 can be configured to execute the methods described with reference to Fig 4 above.
  • the memory 64 can be any combination of random-access memory (RAM) and/or read-only memory (ROM).
  • the memory 64 also comprises non-transitory persistent storage, which, for example, can be any single one or combination of magnetic memory, optical memory, solid-state memory or even remotely mounted memory.
  • a data memory 66 is also provided for reading and/or storing data during execution of software instructions in the processor 60.
  • the data memory 66 can be any combination of RAM and/or ROM.
  • An I/O interface 62 is provided for communicating with external and/or internal entities using wired communication, e.g. based on Ethernet, and/or wireless communication, e.g. Wi-Fi, and/or a cellular network, complying with any one or a combination of sixth generation (6G) mobile networks, next generation mobile networks (fifth generation, 5G), LTE (Long Term Evolution), UMTS (Universal Mobile Telecommunications System) utilising W-CDMA (Wideband Code Division Multiplex), or any other current or future wireless network, as long as the principles described hereinafter are applicable.
  • Fig 6 is a schematic diagram showing functional modules of the video sender 2 of Fig 1 according to one embodiment.
  • the modules are implemented using software instructions such as a computer program executing in the video sender 2.
  • the modules are implemented using hardware, such as any one or more of an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), or discrete logical circuits.
  • the modules correspond to the steps in the methods for the video sender 2 illustrated in Fig 4.
  • An adaptation message receiver 70 corresponds to step 40.
  • An adaptor 72 corresponds to step 42.
  • a segment sender 74 corresponds to step 44.
  • Fig 7 is a schematic diagram showing functional modules of the network condition evaluator 3 of Fig 3A or 3B according to one embodiment.
  • the modules are implemented using software instructions such as a computer program executing in the network condition evaluator 3.
  • the modules are implemented using hardware, such as any one or more of an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), or discrete logical circuits.
  • the modules correspond to the steps in the methods for the network condition evaluator 3 illustrated in Fig 4.
  • a network condition evaluator 80 corresponds to step 50.
  • An adaptation message generator 82 corresponds to step 52.
  • An adaptation message sender 84 corresponds to step 54.
  • Fig 8 shows one example of a computer program product 90 comprising computer readable means.
  • a computer program 91 can be stored in a non-transitory memory.
  • the computer program can cause a processor to execute a method according to embodiments described herein.
  • the computer program product is in the form of a removable solid-state memory, e.g. a Universal Serial Bus (USB) drive.
  • the computer program product could also be embodied in a memory of a device, such as the computer program product 64 of Fig 5.
  • While the computer program 91 is here schematically shown as a section of the removable solid-state memory, the computer program can be stored in any way which is suitable for the computer program product, such as another type of removable solid-state memory, or an optical disc, such as a CD (compact disc), a DVD (digital versatile disc) or a Blu-Ray disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

A method for controlling the sending of at least one picture is provided, the method being performed by a video sender connected to a communication network. The method comprises: receiving an adaptation message from a network condition evaluator of the communication network, the adaptation message being based on network conditions on a path of the communication network between the video sender and a video receiver; adapting the number of segments of at least one future picture; and sending the segments of the at least one future picture to the video receiver. Embodiments for the network condition evaluator side are also provided.
EP22809840.6A 2022-10-31 2022-10-31 Commande de l'envoi d'au moins une image sur un réseau de communication Pending EP4612911A1 (fr)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2022/080362 WO2024094277A1 (fr) 2022-10-31 2022-10-31 Commande de l'envoi d'au moins une image sur un réseau de communication

Publications (1)

Publication Number Publication Date
EP4612911A1 true EP4612911A1 (fr) 2025-09-10

Family

ID=84361450

Family Applications (1)

Application Number Title Priority Date Filing Date
EP22809840.6A Pending EP4612911A1 (fr) 2022-10-31 2022-10-31 Commande de l'envoi d'au moins une image sur un réseau de communication

Country Status (2)

Country Link
EP (1) EP4612911A1 (fr)
WO (1) WO2024094277A1 (fr)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63304745A (ja) * 1987-06-05 1988-12-13 Nec Corp パケットサイズ適応化方式
JP3403683B2 (ja) * 1999-11-26 2003-05-06 沖電気工業株式会社 画像符号化装置および方法
JP5830543B2 (ja) * 2010-11-10 2015-12-09 エヌイーシー ヨーロッパ リミテッドNec Europe Ltd. 輻輳公開対応ネットワークで輻輳管理をサポートする方法
US11451485B2 (en) * 2020-03-27 2022-09-20 At&T Intellectual Property I, L.P. Dynamic packet size adaptation to improve wireless network performance for 5G or other next generation wireless network

Also Published As

Publication number Publication date
WO2024094277A1 (fr) 2024-05-10

Similar Documents

Publication Publication Date Title
CN100539544C (zh) 媒体流式传输分发系统
JP5588019B2 (ja) 信頼性のあるデータ通信のためにネットワーク抽象化レイヤを解析する方法および装置
US20150373075A1 (en) Multiple network transport sessions to provide context adaptive video streaming
CN101222296A (zh) 上行蜂窝视频通信中自适应的传输方法及系统
CN103957389A (zh) 基于压缩感知的3g视频传输方法及系统
CA3233498A1 (fr) Configuration de reseau d'acces radio sensible au codec video et codage de protection contre les erreurs inegales
US10085029B2 (en) Switching display devices in video telephony
CN107079132B (zh) 在视频电话中的端口重配置之后馈送经帧内译码的视频帧
AU2019201095A1 (en) System and method for automatic encoder adjustment based on transport data
CN107210843A (zh) 使用喷泉编码的实时视频通信的系统和方法
KR101953580B1 (ko) 영상회의 시스템에서 데이터 송수신 장치 및 방법
da Silva et al. Preventing quality degradation of video streaming using selective redundancy
Chen et al. Robust video streaming over wireless LANs using multiple description transcoding and prioritized retransmission
WO2024094277A1 (fr) Commande de l'envoi d'au moins une image sur un réseau de communication
US8548030B2 (en) Relay apparatus
Wang et al. Error resilient video coding using flexible reference frames
Ozbek et al. Adaptive streaming of scalable stereoscopic video over DCCP
Cheng et al. Improving transmission quality of MPEG video stream by SCTP multi-streaming and differential RED mechanisms
Ganguly et al. A-REaLiSTIQ-ViBe: Entangling Encoding and Transport to Improve Live Video Experience
Porter et al. HYBRID TCP/UDP video transport for H.264/AVC content delivery in burst loss networks
Surati et al. Evaluate the Performance of Video Transmission Using H.264 (SVC) Over Long Term Evolution (LTE)
WO2025215016A1 (fr) Rafraîchissement d'image à codage intra-image avec réchantillonnage d'image de référence (rpr)
Osman et al. A comparative study of video coding standard performance via local area network
Amin Video QoS/QoE over IEEE802.11n/ac: A Contemporary Survey
Rahman et al. Link Adaptation Based on Video Quality Monitoring for H.264/AVC Video Transmission over WLANs

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: UNKNOWN

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20250528

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR