
WO2025100925A1 - Method and apparatus of qos analytics for immersive media metadata - Google Patents


Info

Publication number
WO2025100925A1
Authority
WO
WIPO (PCT)
Prior art keywords
message
immersive
metadata
immersive media
description information
Prior art date
Legal status
Pending
Application number
PCT/KR2024/017376
Other languages
French (fr)
Inventor
Eric Yip
Jaeyeon Song
Current Assignee
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd filed Critical Samsung Electronics Co Ltd
Publication of WO2025100925A1 publication Critical patent/WO2025100925A1/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L 65/80: Responding to QoS
    • H04L 65/1066: Session management
    • H04L 65/1069: Session establishment or de-establishment
    • H04L 67/00: Network arrangements or protocols for supporting network services or applications
    • H04L 67/01: Protocols
    • H04L 67/131: Protocols for games, networked simulations or virtual reality
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W 24/00: Supervisory, monitoring or testing arrangements
    • H04W 24/02: Arrangements for optimising operational condition
    • H04W 24/08: Testing, supervising or monitoring using real traffic

Definitions

  • 5G network systems for multimedia, architectures and procedures for immersive media and metadata delivery over 5G, descriptions for immersive media metadata, immersive metadata message formats, immersive metadata data channel message format, immersive metadata including extended reality (XR), avatar, artificial intelligence (AI)/machine learning (ML) related immersive media services.
  • XR extended reality
  • AI artificial intelligence
  • ML machine learning
  • Fifth generation (5G) mobile communication technologies define broad frequency bands such that high transmission rates and new services are possible, and can be implemented not only in "Sub 6 gigahertz (GHz)" bands such as 3.5 GHz, but also in "Above 6 GHz" bands referred to as millimeter wave (mmWave), including 28 GHz and 39 GHz.
  • GHz sub 6 gigahertz
  • mmWave millimeter wave
  • 6G mobile communication technologies referred to as Beyond 5G systems
  • THz terahertz
  • V2X Vehicle-to-everything
  • NR-U New Radio Unlicensed
  • UE user equipment
  • NTN Non-Terrestrial Network
  • IIoT Industrial Internet of Things
  • IAB Integrated Access and Backhaul
  • DAPS Dual Active Protocol Stack
  • RACH random access channel
  • 5G baseline architecture (for example, service based architecture or service based interface)
  • NFV Network Functions Virtualization
  • SDN Software-Defined Networking
  • MEC Mobile Edge Computing
  • multi-antenna transmission technologies such as Full Dimensional MIMO (FD-MIMO), array antennas and large-scale antennas, metamaterial-based lenses and antennas for improving coverage of terahertz band signals, high-dimensional space multiplexing technology using Orbital Angular Momentum (OAM), and Reconfigurable Intelligent Surface (RIS), but also full-duplex technology for increasing frequency efficiency of 6G mobile communication technologies and improving system networks, AI-based communication technology for implementing system optimization by utilizing satellites and Artificial Intelligence (AI) from the design stage and internalizing end-to-end AI support functions, and next-generation distributed computing technology for implementing services at levels of complexity exceeding the limit of UE operation capability by utilizing ultra-high-performance communication and computing resources.
  • FD-MIMO Full Dimensional MIMO
  • OAM Orbital Angular Momentum
  • RIS Reconfigurable Intelligent Surface
  • Immersive multimedia is becoming the next potential killer-service for not only 5G advanced, but also 6G.
  • Such immersive media services include XR services (including MR and augmented reality (AR)), as well as real time communication immersive services such as avatar related calls or use cases.
  • AR augmented reality
  • AI/ML is almost an essential part of next generation multimedia use cases.
  • Multimedia services in SA4 currently support various media components such as video, audio (& speech) and timed text.
  • the main benefit and advantage of providing media services via 5G is the ability to guarantee the quality of service (QoS), since mobile network operators (MNOs) are able to provide the specific data traffic needs for a defined service.
  • QoS quality of service
  • MNOs mobile network operators
  • the QoS of each of these components is clearly defined such that it can be used by the 5GS to provide an overall QoS (or several QoS's, depending on the service configuration) for the service.
  • Metadata is typically non-timed for delivery purposes (e.g. does not need to be delivered in a synchronized manner with video/audio media data) but requires synchronization with video/audio media data during presentation (i.e. when the media is consumed and displayed to the user during the service).
  • Such metadata may be for example:
  • Avatar information/data: more specifically, avatar models which represent the face/head/body of a user's avatar, and skeleton information which represents the actual movement of the user at any given time, to be used to control the avatar
  • AI/ML data: more specifically, the information/data related to the AI model topology (or structure/architecture), and the data related to the weights or biases of the AI model, which may be updated separately during the service.
  • This disclosure contains mechanisms to describe the various types of immersive media metadata, such that the corresponding QoS can be assigned to the requirements of the metadata for the service.
  • the type of immersive media service determines the type of immersive metadata required, and the delivery characteristics differ depending on the required immersive metadata. For example:
  • a method performed by a terminal in a wireless communication system comprises transmitting, to an immersive media server, a first message for creating an immersive media session, receiving, from the immersive media server, a second message including a description of immersive media output, establishing a transport connection for an immersive media service and receiving, from the immersive media server, a third message for an immersive metadata (IM) set, wherein the third message includes description information for the IM set, and wherein the description information includes a traffic type of the IM set and an IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
  • IM immersive metadata
  • a method performed by an immersive media server in a wireless communication system comprises receiving, from a terminal, a first message for creating an immersive media session, transmitting, to the terminal, a second message including a description of immersive media output, establishing a transport connection for an immersive media service and transmitting, to the terminal, a third message for an immersive metadata (IM) set, wherein the third message includes description information for the IM set, and
  • IM immersive metadata
  • the description information includes a traffic type of the IM set and IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
  • a terminal in a wireless communication system comprises a transceiver and a controller configured to transmit, to an immersive media server, a first message for creating an immersive media session, to receive, from the immersive media server, a second message including a description of immersive media output, to establish a transport connection for an immersive media service, and to receive, from the immersive media server, a third message for an immersive metadata (IM) set, wherein the third message includes description information for the IM set, and wherein the description information includes a traffic type of the IM set and an IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
  • IM immersive metadata
  • an immersive media server in a wireless communication system comprises a transceiver and a controller configured to receive, from a terminal, a first message for creating an immersive media session, to transmit, to the terminal, a second message including a description of immersive media output, to establish a transport connection for an immersive media service, and to transmit, to the terminal, a third message for an immersive metadata (IM) set, wherein the third message includes description information for the IM set, and wherein the description information includes a traffic type of the IM set and an IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
  • IM immersive metadata
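The four-message exchange recited in the claims above can be sketched from the terminal's point of view as follows. This is a minimal sketch: all class, method and field names are illustrative assumptions, and the transport itself is stubbed out.

```python
from dataclasses import dataclass

@dataclass
class ImSetDescription:
    """Description information for an IM set, as recited in the claims."""
    traffic_type: str  # traffic characteristic of the IM set, e.g. "continuous"
    im_size: int       # maximum size of a unit of immersive metadata, in bytes

class Terminal:
    def __init__(self, server):
        self.server = server

    def setup(self):
        self.server.create_session()                        # first message
        self.output = self.server.describe_output()         # second message
        self.transport = self.server.establish_transport()  # transport connection
        return self.server.describe_im_set()                # third message

class StubServer:
    """Stand-in for the immersive media server; not a real implementation."""
    def create_session(self): pass
    def describe_output(self): return {"output": "immersive media"}
    def establish_transport(self): return "rtc-4"
    def describe_im_set(self):
        return ImSetDescription(traffic_type="continuous", im_size=512)

desc = Terminal(StubServer()).setup()
```

The returned `ImSetDescription` is the piece of information the 5GC would ultimately use to assign a matching QoS to the metadata stream.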
  • Figure 1 illustrates the overall 5G Media Streaming Architecture
  • FIG. 2 illustrates the multiple components of multimedia, typically video, audio and timed text, each of which has its own specified QoS.
  • Figure 3A illustrates architecture for immersive media between a UE and a trusted DN, which identifies the various functional entities and interfaces for enabling immersive media services in this disclosure
  • Figure 3B illustrates a general high level call flow for immersive media
  • Figure 4 illustrates the immersive media service session setup for the case where immersive media metadata is delivered in the downlink direction
  • Figure 5 illustrates the immersive media service session setup for the case where immersive media metadata is delivered in the uplink direction
  • Figure 6 illustrates three different types of immersive metadata set traffic types.
  • Figure 7 illustrates an electronic device according to an embodiment of the present disclosure.
  • Figure 8 illustrates a node according to an embodiment of the present disclosure.
  • FIGS. 1 through 6, discussed below, and the various embodiments used to describe the principles of the present disclosure in this patent document are by way of illustration only and should not be construed in any way to limit the scope of the disclosure. Those skilled in the art will understand that the principles of the present disclosure may be implemented in any suitably arranged system or device.
  • each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general-purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart block or blocks.
  • These computer program instructions may also be stored in a computer usable or computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer usable or computer-readable memory produce an article of manufacture including instruction means that implement the function specified in the flowchart block or blocks.
  • the computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions that execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart block or blocks.
  • each block of the flowchart illustrations may represent a module, segment, or portion of code, which includes one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the blocks may occur out of order. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • the "unit" refers to a software element or a hardware element, such as a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC), which performs a predetermined function.
  • FPGA Field Programmable Gate Array
  • ASIC Application Specific Integrated Circuit
  • the "unit" does not always have a meaning limited to software or hardware.
  • the "unit" may be constructed either to be stored in an addressable storage medium or to be executed by one or more processors. Therefore, the "unit" includes, for example, software elements, object-oriented software elements, class elements or task elements, processes, functions, properties, procedures, sub-routines, segments of a program code, drivers, firmware, micro-codes, circuits, data, database, data structures, tables, arrays, and parameters.
  • the elements and functions provided by the "unit" may be either combined into a smaller number of elements, or a "unit," or divided into a larger number of elements, or a "unit." Moreover, the elements and "units" may be implemented to reproduce one or more CPUs within a device or a security multimedia card. Furthermore, the "unit" in the embodiments may include one or more processors.
  • a base station is an entity that allocates resources to a UE and may be at least one of a node B, a base station (BS), an eNode B (eNB), a gNode B (gNB), a radio access unit, a base station controller, and a node on a network.
  • the UE may include a user equipment (UE), a mobile station (MS), a cellular phone, a smartphone, a computer, or a multimedia system capable of performing a communication function.
  • embodiments of the disclosure may be applied to other communication systems having a similar technical background or channel type to that of the embodiments of the disclosure.
  • the embodiments of the disclosure may also be applied to other communication systems through some modifications without significantly departing from the range of the disclosure based on determination of those skilled in the technical knowledge.
  • a 5th generation mobile communication technology (5G, new radio (NR)) developed after LTE-A may be included herein, and 5G may be a concept embracing the existing LTE and LTE-A and similar other services.
  • 5G may also be applied to other communication systems through some modifications without significantly departing from the range of the disclosure based on determination of those skilled in the technical knowledge.
  • 3GPP 3rd generation partnership project
  • LTE long term evolution
  • NR 3GPP new radio
  • Figure 1 shows the overall 5G Media Streaming Architecture in TS 26.501, representing the specified 5GMS functions within the 5GS as defined in TS 23.501.
  • 5GMS functions include the UE, RAN, UPF, Trusted DN, External DN, 5GMS, 5G system, External, NEF, and PCF.
  • FIG. 2 shows the multiple components of multimedia, typically video, audio and timed text, each of which has its own specified QoS.
  • a grouping of these different immersive media types can be defined, and as such the required QoS for each of these media streams is requested and provisioned by the 5G core (5GC).
  • 5GC 5G core
  • a combination of these multiple media streams is given a combined QoS, such as when multiple media streams are multiplexed into a single PDU session, or bundled together similarly.
  • This multiplexing of multiple media streams into a multiplexed stream in a transport connection is one of the QoS handling enhancements considered for network enhancements of the 5GC for extended reality and media (XRM).
  • XRM extended reality and media
  • a new generic media stream type for immersive metadata is defined, where the embodiments of this disclosure allow for a similar QoS to be specified for a specific immersive metadata stream required for the immersive service, and thus requested and provisioned by the 5GC.
  • Figure 3A shows architecture for immersive media between a UE and a trusted DN, which identifies the various functional entities and interfaces for enabling immersive media services in this disclosure.
  • Immersive media AF: an Application Function similar to that defined in TS 23.501 clause 6.2.10, dedicated to immersive media. Typically provides various control functions to the Media Session Handler on the UE and/or to the Application Provider. It may interact with other 5GC network functions, such as a data collection proxy (DCP) function entity.
  • DCP data collection proxy
  • the DCP may or may not include NWDAF function/functionality.
  • Immersive media AS: an application server (AS) dedicated to immersive media services.
  • the immersive media AS typically supports immersive media hosting by ingesting immersive contents from an Application Provider, and egesting models to other network functions for media processing; it may also contain sub-functions which perform media processing for immersive media.
  • An example of these immersive media processes may include: split/network rendering using pose information, avatar model processing or rendering using avatar metadata, and/or AI model inferencing in the network (either full inferencing or partial).
  • Application Provider: external application, with content-specific media functionality, and/or immersive media specific functionality (e.g. avatar creation, AI model creation, splitting, updating etc.).
  • Media Session Handler: a function on the UE that communicates with the immersive media AF in order to establish, control and support the delivery of an immersive media session, and may perform additional functions such as consumption and QoE metrics collection and reporting.
  • the Session Handler may expose APIs that can be used by the immersive media Aware Application. It may contain logical subfunctions for immersive media, such as a metadata capability manager (which may include an AI Capability Manager), which handles the negotiation and handling of metadata and capability related data and decisions, both internally in the UE and between the UE and the network.
  • Immersive media client: a function on the UE that communicates with the immersive media AS in order to download/stream (or even upload) the immersive media, and also the immersive media metadata, and may provide APIs to the immersive media Aware Application for metadata processing, and to the Media Session Handler for media/metadata session control in the UE.
  • the description messages describing the mentioned immersive metadata are sent via the control plane between the AF and the media session handler (RTC-5), and the actual metadata is sent via the user plane between the immersive media server and the immersive media client (RTC-4).
  • Figure 3B shows a general high level call flow for immersive media.
  • the Application Provider requests and sets up the edge server(s) used for the immersive media service where needed, in particular to process immersive metadata, as described in TS 26.506 clauses 6.1 or 6.2.
  • the Application Provider may use any other method to allocate edge servers, or leave it to the MNO to set up appropriate edge servers to run the processing of immersive metadata.
  • the Application Provider provisions the immersive media service session using RTC-1 and RTC-5. If the edge servers were provisioned in step 1, the edge servers' ids are provided in this session to employ them for immersive media processing.
  • the immersive media service session is set up according to figures 4 and 5.
  • Figure 4 shows the immersive media service session setup for the case where immersive media metadata is delivered in the downlink direction.
  • the steps are:
  • the Metadata Manager (typically a Presentation Engine or similar) discovers the immersive media server and sets up a connection to it. It provides information about its immersive processing and rendering capabilities and the possible XR runtime configuration; e.g. the OpenXR configuration may be used for this purpose.
  • the immersive media server creates a description of the immersive media output and the input it expects to receive from the UE.
  • the Metadata Manager requests the buffer streams from the MAF, which in turn establishes a connection to the immersive media server to deliver or stream the required immersive metadata for the immersive media service.
  • the Immersive Media Server transmits the immersive metadata to the Media Access Function.
  • the immersive metadata is passed to the Metadata Manager.
  • the immersive media data may also be transmitted to the Media Access Function in addition to the immersive metadata in step 4.
  • the immersive metadata received is then processed by the UE, between the Metadata Manager and the XR Source Manager.
  • the processed media data (using the metadata) or raw buffer frames are passed to the XR Runtime for rendering.
  • the immersive media is composed and rendered.
  • Figure 5 shows the immersive media service session setup for the case where immersive media metadata is delivered in the uplink direction.
  • the steps are:
  • the Metadata Manager (typically a Presentation Engine or similar) discovers the immersive media server and sets up a connection to it. It provides information about its immersive processing and rendering capabilities and the possible XR runtime configuration; e.g. the OpenXR configuration may be used for this purpose.
  • the immersive media server creates a description of the immersive media output and the input it expects to receive from the UE.
  • the Metadata Manager requests the buffer streams from the MAF, which in turn establishes a connection to the immersive media server to deliver or stream the required immersive metadata for the immersive media service.
  • the Source Manager retrieves immersive metadata from the XR runtime.
  • the Source Manager shares the immersive metadata with the immersive media server.
  • the immersive media server uses the metadata to perform processing related to the immersive media.
  • the Media Access Function decodes and processes the received buffers or media data.
  • the raw buffer frames or processed media data are sent to the Metadata Consumer and XR runtime.
  • the Immersive Media is composed and rendered.
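The uplink steps above can be sketched as a sequence of calls. Every name in this sketch is an illustrative assumption; only the ordering of the steps follows the text.

```python
class ImmersiveMediaServer:
    """Illustrative stand-in for the immersive media server."""

    def describe_output(self, capabilities):
        # Step 2: the server describes its output and the input it expects.
        return {"expected_input": ["urn:3gpp:im:v1:pose"], "output": "rendered frames"}

    def process_metadata(self, metadata):
        # Steps 5-6: the server uses the uplink metadata for media processing.
        return {"processed_with": metadata["type"]}

def uplink_session(server, pose_from_xr_runtime):
    capabilities = {"rendering": "openxr"}               # step 1: discovery + capabilities
    description = server.describe_output(capabilities)   # step 2: output description
    # step 3: transport connection establishment is omitted in this sketch
    metadata = {"type": description["expected_input"][0],
                "pose": pose_from_xr_runtime}            # step 4: retrieve IM from XR runtime
    return server.process_metadata(metadata)             # steps 5-6: share and process

result = uplink_session(ImmersiveMediaServer(), {"x": 0.0, "y": 1.6, "z": 0.0})
```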
  • Figure 6 is an embodiment of this disclosure showing three different types of immersive metadata set traffic types.
  • An Immersive Metadata set is defined as a series of metadata units which are delivered together with the same timestamp, as a single set unit.
  • This disclosure in particular identifies the following immersive metadata:
  • Continuous: this type is typically used for small, continuous data, where the frequency of the metadata is also typically high (such as user avatar animation data, or user location/pose information data).
  • Burst persistent: this type is typically used for metadata which contains large burst data followed by small, continuous data in the same IM set (such as an IM set containing first user representation data, followed by multiple user animation data).
  • Burst interval: this type is typically used for metadata which contains only large burst data, with no other metadata contained for delivery in the IM set between burst data.
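Using the naming given later in this document (continuous, burst persistent, burst interval), the three IM set traffic types can be written down as a small enum; the mapping of the examples to types follows the descriptions above, and the enum values themselves are assumptions.

```python
from enum import Enum

class ImSetTrafficType(Enum):
    CONTINUOUS = "continuous"              # small, frequent metadata units
    BURST_PERSISTENT = "burst_persistent"  # large burst, then continuous units
    BURST_INTERVAL = "burst_interval"      # large bursts only, nothing in between

# Illustrative classification of the examples mentioned in the text.
EXAMPLES = {
    "user location/pose information": ImSetTrafficType.CONTINUOUS,
    "user avatar animation data": ImSetTrafficType.CONTINUOUS,
    "user representation + animation in one IM set": ImSetTrafficType.BURST_PERSISTENT,
    "large model updates with nothing in between": ImSetTrafficType.BURST_INTERVAL,
}
```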
  • an IM set contains RTP (real-time transport protocol) packets of such non-timed immersive metadata, delivered via either WebRTC or MTSI/IMS.
  • RTP real-time transport protocol
  • Each metadata type has its respective encapsulation format
  • - Immersive media metadata shall use WebRTC for the real-time transport of the non-timed media metadata.
  • the RTP restrictions for WebRTC as specified in RFC8834 shall apply.
  • Metadata may be carried via SRTP, or alternatively via WebRTC data channel; the usage of the WebRTC data channel shall be in accordance with RFC8831.
  • packet header extensions are defined, such as RTP header extensions if transport protocol based on RTP is used.
  • the syntax and semantics of these extensions are defined below:
  • Immersive metadata sets correspond to one or more PDUs containing immersive metadata
  • This field is a flag that shall be set to 1 for the last PDU of the IM Set and set to 0 for all other PDUs of the IM Set.
  • EDB End of Data Burst
  • the EDB field is 3 bits in length and indicates the end of a Data Burst.
  • the 3 bits encode the End of Data Burst indication as per the encoding and guidelines provided in Clause 4.4.2.6.1.
  • the IM Set traffic type indicates the traffic characteristic of the immersive metadata in the IM set.
  • IBS IM Burst size
  • - IM size [IS] (24 bits): Indicates the maximum size of a unit of immersive metadata when ISTT is either continuous or burst persistent.
  • the field encodes the sequence number of the IM Set to which the current PDU belongs, acting as a 10-bit numerical identifier for the IM Set.
  • PSN PDU Sequence Number within an IM Set [PSN] (6 bits): The sequence number of the current PDU within the PDU Set.
  • the PSN shall be set to 0 for the first PDU in the PDU Set and incremented monotonically for every PDU in the PDU set in order of transmission from the sender.
  • a receiver may use the RTP packet sequence number together with the PSN to distinguish between PDUs within an IM Set that contains more than 64 PDUs.
  • the IM Set Size indicates the total size of all PDUs of the IM Set to which this PDU belongs. This field is optional and subject to an SDP signaling offer/answer negotiation, where the Application Server may indicate whether it will be able to provide the size of the IM Set for that RTP stream. If not enabled, the field should not be present. If enabled, but the Application Server is not able to determine the IM Set Size for a particular PDU Set, it should set the value to 0 in all PDUs of that IM Set.
  • the PSSize shall indicate the size of a PDU Set including RTP/UDP/IP header encapsulation overhead of its corresponding PDUs. The PSSize is expressed in bytes.
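The field widths given above (1-bit end flag, 3-bit EDB, 24-bit IS, 10-bit IM Set sequence number, 6-bit PSN) can be packed into a header extension as in the following sketch. The 2-bit ISTT width, the ISTT codepoints and the overall field order are assumptions, not taken from the text.

```python
ISTT_CONTINUOUS, ISTT_BURST_PERSISTENT, ISTT_BURST_INTERVAL = 0, 1, 2  # assumed codes

def pack_im_header(end_flag, edb, istt, im_size, im_set_seq, psn):
    assert 0 <= edb < 2**3 and 0 <= istt < 2**2
    assert 0 <= im_size < 2**24 and 0 <= im_set_seq < 2**10 and 0 <= psn < 2**6
    value = end_flag & 1                 # 1 bit: last PDU of the IM Set
    value = (value << 3) | edb           # 3 bits: End of Data Burst (EDB)
    value = (value << 2) | istt          # 2 bits (assumed): IM Set traffic type (ISTT)
    value = (value << 24) | im_size      # 24 bits: max size of an IM unit (IS)
    value = (value << 10) | im_set_seq   # 10 bits: IM Set sequence number
    value = (value << 6) | psn           # 6 bits: PDU sequence number in the set (PSN)
    return value.to_bytes(6, "big")      # 46 bits -> 6 bytes (2 padding bits)

def unpack_psn(packed):
    # The PSN occupies the 6 least significant bits in this assumed layout.
    return int.from_bytes(packed, "big") & 0x3F

hdr = pack_im_header(end_flag=1, edb=0b001, istt=ISTT_CONTINUOUS,
                     im_size=512, im_set_seq=42, psn=5)
```

A receiver would combine the 6-bit PSN with the RTP packet sequence number to disambiguate IM Sets of more than 64 PDUs, as noted above.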
  • an immersive metadata message containing such non-timed immersive media metadata is delivered between the server and UE, via WebRTC data channel.
  • the usage of metadata data channel messages to carry actual metadata is typically limited to those with traffic characteristics of small, continuous data, i.e. of type continuous in figure 6.
  • the IM message content depends on the type of the IM message.
  • the data channel sub-protocol is defined separately
  • - IM message types shall be unique identifiers in the URN format. This embodiment defines a set of IM types and their formats.
  • Immersive media shall use WebRTC for the real-time transport of the immersive metadata message.
  • the data channel sub-protocol shall be identified as "3gpp-im-message", which shall be included in the dcmap attribute of the SDP.
  • the transmission order for the data channel shall be set to in-order and the transmission reliability shall be set to reliable.
  • the immersive metadata message format shall be set to text-based and the messages shall be UTF-8 encoded JSON messages.
  • a data channel message may carry one or more immersive metadata messages as defined below.
  • Message type payload formats (typically small, continuous data):
  • the immersive media client on the XR device periodically transmits a set of pose predictions to the immersive media server.
  • the type of the message shall be set to "urn:3gpp:im:v1:pose".
  • Each predicted pose shall contain the associated predicted display time and an identifier of the XR space that was used for that pose.
  • the payload of the message shall be as follows (Pose Prediction Format):
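The Pose Prediction Format table itself is not reproduced in this text. The JSON below is only an illustrative guess consistent with the requirements stated above (UTF-8 JSON encoding, and a predicted display time plus an XR space identifier per predicted pose); all field names are assumptions.

```python
import json

pose_message = {
    "type": "urn:3gpp:im:v1:pose",
    "poses": [
        {
            "displayTime": 1699999999.033,  # predicted display time (assumed units)
            "xrSpace": "local",             # identifier of the XR space used for the pose
            "position": {"x": 0.0, "y": 1.6, "z": 0.0},               # assumed field
            "orientation": {"x": 0.0, "y": 0.0, "z": 0.0, "w": 1.0},  # assumed field
        }
    ],
}
encoded = json.dumps(pose_message).encode("utf-8")  # messages are UTF-8 encoded JSON
```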
  • Similar payload formats can be defined for types of "avatar" and "AI/ML" immersive metadata data channel messages.
  • Avatar animation format (example)
  • AI/ML AI model weight update format (example)
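Putting the pieces together, a single data channel message carrying one or more IM messages might be built as in the sketch below. The `messages` wrapper, the payload field names and the avatar URN are assumptions; only the UTF-8 JSON encoding, the pose URN and the possibility of batching several IM messages come from the text.

```python
import json

def make_data_channel_message(im_messages):
    # A data channel message may carry one or more immersive metadata messages;
    # each IM message has a URN-format type and a type-specific payload.
    return json.dumps({"messages": im_messages}).encode("utf-8")

payload = make_data_channel_message([
    {"type": "urn:3gpp:im:v1:pose", "payload": {"poses": []}},
    {"type": "urn:3gpp:im:v1:avatar", "payload": {"animation": []}},  # assumed URN
])
```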
  • an immersive metadata configuration message containing descriptions of such non-timed immersive media metadata is delivered between the server and UE, either directly via WebRTC data channel, or via the SWAP protocol using a SWAP server between the immersive media client and the immersive media server.
  • the app-specific message on IM configuration may be parsed and connected with SDP offers and answers via related SDP parameters. More specifically, such immersive metadata configuration messages describe the same information related to the IM set as defined in embodiment 1
  • This field is a flag that shall be set to 1 for the last PDU of the IM Set and set to 0 for all other PDUs of the IM Set.
  • EDB End of Data Burst
  • the EDB field is 3 bits in length and indicates the end of a Data Burst.
  • the 3 bits encode the End of Data Burst indication as per the encoding and guidelines provided in Clause 4.4.2.6.1.
  • the IM Set traffic type indicates the traffic characteristic of the immersive metadata in the IM set.
  • the traffic type can be set to continuous, burst persistent, or burst interval.
  • IBS IM Burst size
  • - IM size [IS] (24 bits): Indicates the maximum size of a unit of immersive metadata when ISTT is either continuous or burst persistent.
  • the field encodes the sequence number of the IM Set to which the current PDU belongs, acting as a 10-bit numerical identifier for the IM Set.
  • PSN PDU Sequence Number within an IM Set [PSN] (6 bits): The sequence number of the current PDU within the PDU Set.
  • the PSN shall be set to 0 for the first PDU in the PDU Set and incremented monotonically for every PDU in the PDU set in order of transmission from the sender.
  • a receiver may use the RTP packet sequence number together with the PSN to distinguish between PDUs within an IM Set that contains more than 64 PDUs.
  • the IM Set Size indicates the total size of all PDUs of the IM Set to which this PDU belongs. This field is optional and subject to an SDP signaling offer/answer negotiation, where the Application Server may indicate whether it will be able to provide the size of the IM Set for that RTP stream. If not enabled, the field should not be present. If enabled, but the Application Server is not able to determine the IM Set Size for a particular PDU Set, it should set the value to 0 in all PDUs of that IM Set.
  • the PSSize shall indicate the size of a PDU Set including RTP/UDP/IP header encapsulation overhead of its corresponding PDUs. The PSSize is expressed in bytes.
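To picture how the fields above fit together, the following sketch packs the 1-bit end-of-IM-Set flag, the 10-bit IM Set sequence number and the 6-bit PSN into one integer. The bit widths follow the text; the overall layout and the function names are assumptions for illustration, not a normative wire format.

```python
def pack_im_header(end_of_set: int, isn: int, psn: int) -> int:
    """Pack the 1-bit end-of-IM-Set flag, the 10-bit IM Set sequence
    number (ISN) and the 6-bit PDU sequence number (PSN) into a single
    integer, with the flag in the most significant position.
    Illustrative layout only."""
    if end_of_set not in (0, 1) or not 0 <= isn < 1024 or not 0 <= psn < 64:
        raise ValueError("field out of range")
    return (end_of_set << 16) | (isn << 6) | psn

def unpack_im_header(value: int) -> tuple:
    """Reverse of pack_im_header: returns (end_of_set, isn, psn)."""
    return (value >> 16) & 0x1, (value >> 6) & 0x3FF, value & 0x3F
```

Consistent with the text above, a PSN of 0 would mark the first PDU of an IM Set, and the 10-bit sequence number wraps after 1023 sets.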
  • This section describes the process of sending immersive media metadata which is pose information, for services requiring such metadata, e.g. XR, AR or MR immersive media services.
  • Step 3 in figure 3B for session setup then proceeds.
  • Since pose information is delivered in the uplink direction, step 3 in figure 3B proceeds as shown in the procedure defined by figure 5.
  • the immersive metadata configuration message as described in embodiment 3 is exchanged between the UE and server, in step 3 of figure 5.
  • the parameters inside this configuration message enable the required QoS to be assigned for the metadata to be transported.
  • the IM Set Traffic Type [ISTT] is typically set to continuous.
  • Metadata (pose information) is delivered from the UE to the server in step 5 of figure 5 in two different possible manners:
  • immersive metadata messages which contain the actual metadata, with the message format type for pose information, as specified in embodiment 2.
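Putting the steps above together, a minimal pose configuration could look like the following sketch. The JSON key names and values other than the traffic type are assumptions; the text above only fixes that the IM Set Traffic Type [ISTT] is continuous for pose metadata and that pose flows uplink.

```python
# Illustrative immersive metadata configuration message for pose information.
# Key names are assumed for this sketch; the description above only fixes
# that the IM Set Traffic Type [ISTT] is "continuous" for pose metadata.
pose_im_config = {
    "imSetTrafficType": "continuous",  # ISTT: pose updates flow at a steady rate
    "imSize": 64,                      # IS: assumed max size (bytes) of one pose unit
    "direction": "uplink",             # pose is sent from the UE to the server (figure 5)
}
```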
  • This section describes the process of sending immersive media metadata which is avatar immersive metadata, for services requiring such metadata, e.g. avatar calls, or games with avatars.
  • Step 3 in figure 3B for session setup then proceeds.
  • Since avatar immersive metadata can typically be sent in both the uplink and downlink directions, step 3 in figure 3B proceeds as shown in the procedure defined by either figure 4 or figure 5.
  • the immersive metadata configuration message as described in embodiment 3 is exchanged between the UE and server, in step 3 of figure 5.
  • the parameters inside this configuration message enable the required QoS to be assigned for the metadata to be transported.
  • the IM Set Traffic Type [ISTT] is typically set to burst persistent when both the avatar model and the skeleton are sent, or to continuous when only the skeleton information is sent.
  • Metadata is delivered from the server to the UE in step 4 of figure 4, or from the UE to the server in step 5 of figure 5, in two different possible manners:
  • immersive metadata messages which contain the actual metadata, with the message format type for avatar information, as specified in embodiment 2.
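The avatar traffic-type choice described above can be sketched as a small helper; the function name and string values are assumptions mirroring the ISTT values in the text.

```python
def avatar_istt(model_included: bool) -> str:
    """Select the IM Set Traffic Type [ISTT] for avatar metadata:
    'burst persistent' when the (large) avatar model is sent alongside
    the skeleton, 'continuous' when only skeleton updates are sent.
    Names and values are illustrative, not normative."""
    return "burst persistent" if model_included else "continuous"
```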
  • This section describes the process of sending immersive media metadata which is AI/ML immersive metadata, for services requiring such metadata, e.g. AI/ML processing of media for upscaling, vision applications etc.
  • Step 3 in figure 3B for session setup then proceeds.
  • Since AI/ML immersive metadata can typically be sent in both the uplink and downlink directions, step 3 in figure 3B proceeds as shown in the procedure defined by either figure 4 or figure 5.
  • the immersive metadata configuration message as described in embodiment 3 is exchanged between the UE and server, in step 3 of figure 5.
  • the parameters inside this configuration message enable the required QoS to be assigned for the metadata to be transported.
  • the IM Set Traffic Type [ISTT] is typically set to burst persistent when both the AI model topology/architecture and updates of biases and weights are constantly sent, or to burst interval when only the AI model topology is sent periodically (i.e. updated).
  • Metadata (AI/ML information) is delivered from the server to the UE in step 4 of figure 4, or from the UE to the server in step 5 of figure 5, in two different possible manners:
  • immersive metadata messages which contain the actual metadata, with the message format type for AI/ML information, as specified in embodiment 2.
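Analogously, the ISTT choice for AI/ML metadata described above can be sketched as follows; the function name and string values are illustrative assumptions mirroring the traffic types in the text.

```python
def aiml_istt(weights_constantly_sent: bool) -> str:
    """Select the IM Set Traffic Type [ISTT] for AI/ML metadata:
    'burst persistent' when the model topology plus continual weight/bias
    updates are sent, 'burst interval' when only the topology is
    refreshed periodically. Names and values are illustrative."""
    return "burst persistent" if weights_constantly_sent else "burst interval"
```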
  • FIG. 7 illustrates an electronic device according to an embodiment of the present disclosure.
  • the electronic device 700 may include a processor 710, a transceiver 720 and a memory 730. However, not all of the illustrated components are essential. The electronic device 700 may be implemented by more or fewer components than those illustrated in FIG. 7. In addition, the processor 710, the transceiver 720 and the memory 730 may be implemented as a single chip according to another embodiment.
  • the processor 710 may include one or more processors or other processing devices that control the provided function, process, and/or method. Operation of the electronic device 700 may be implemented by the processor 710.
  • the transceiver 720 may include an RF transmitter for up-converting and amplifying a transmitted signal, and an RF receiver for down-converting a frequency of a received signal.
  • the transceiver 720 may be implemented by more or fewer components than those illustrated.
  • the transceiver 720 may be connected to the processor 710 and transmit and/or receive a signal.
  • the signal may include control information and data.
  • the transceiver 720 may receive the signal through a wireless channel and output the signal to the processor 710.
  • the transceiver 720 may transmit a signal output from the processor 710 through the wireless channel.
  • FIG. 8 illustrates a node according to an embodiment of the present disclosure.
  • the node 800 may include a processor 810, a transceiver 820 and a memory 830. However, not all of the illustrated components are essential. The node 800 may be implemented by more or fewer components than those illustrated in FIG. 8. In addition, the processor 810, the transceiver 820 and the memory 830 may be implemented as a single chip according to another embodiment.
  • the node 800 may correspond to the base station, the server and/or the immersive media server described above.
  • the transceiver 820 may include an RF transmitter for up-converting and amplifying a transmitted signal, and an RF receiver for down-converting a frequency of a received signal.
  • the transceiver 820 may be implemented by more or fewer components than those illustrated.
  • the memory 830 may store the control information or the data included in a signal obtained by the node 800.
  • the memory 830 may be connected to the processor 810 and store at least one instruction or a protocol or a parameter for the provided function, process, and/or method.
  • the memory 830 may include read-only memory (ROM) and/or random access memory (RAM) and/or hard disk and/or CD-ROM and/or DVD and/or other storage devices.
  • a computer-readable storage medium may be provided to store one or more programs (software modules).
  • the one or more programs stored in the computer-readable storage medium may be configured for execution by one or more processors in an electronic device.
  • the one or more programs may include instructions for causing the electronic device to execute the methods according to the embodiments of the disclosure described in the specification or the claims.
  • the programs may be stored in an attachable storage device that may be accessed through a communication network such as the Internet, an intranet, a local area network (LAN), a wireless LAN (WLAN), or a storage area network (SAN), or through a communication network constituted by any combination thereof.
  • a storage device may be connected through an external port to an apparatus performing an embodiment of the disclosure.
  • a separate storage device on a communication network may be connected to an apparatus performing an embodiment of the disclosure.
  • the components included in the disclosure are expressed in the singular or plural according to the presented particular embodiments.
  • the singular or plural expressions are selected suitably according to the presented situations for convenience of description; the disclosure is not limited to the singular or plural components, and the components expressed in the plural may even be constituted in the singular or the components expressed in the singular may even be constituted in the plural.


Abstract

The disclosure relates to a 5G or 6G communication system for supporting a higher data transmission rate. The disclosure relates to a method performed by a terminal in a wireless communication system, the method comprising transmitting, to an immersive media server, a first message for creating an immersive media session, receiving, from the immersive media server, a second message including a description of immersive media output, establishing a transport connection for an immersive media service and receiving, from the immersive media server, a third message for an immersive metadata (IM) set, wherein the third message includes description information for the IM set, and wherein the description information includes a traffic type of the IM set and an IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.

Description

METHOD AND APPARATUS OF QOS ANALYTICS FOR IMMERSIVE MEDIA METADATA
The disclosure relates to 5G network systems for multimedia; architectures and procedures for immersive media and metadata delivery over 5G; descriptions for immersive media metadata; immersive metadata message formats; the immersive metadata data channel message format; and immersive metadata, including extended reality (XR), avatar, and artificial intelligence (AI)/machine learning (ML) related immersive media services.
Fifth generation (5G) mobile communication technologies define broad frequency bands such that high transmission rates and new services are possible, and can be implemented not only in "Sub 6 gigahertz (GHz)" bands such as 3.5GHz, but also in "Above 6GHz" bands referred to as millimeter wave (mmWave) including 28GHz and 39GHz. In addition, it has been considered to implement sixth generation (6G) mobile communication technologies (referred to as Beyond 5G systems) in terahertz (THz) bands (for example, 95GHz to 3THz bands) in order to accomplish transmission rates fifty times faster than 5G mobile communication technologies and ultra-low latencies one-tenth of 5G mobile communication technologies.
At the beginning of the development of 5G mobile communication technologies, in order to support services and to satisfy performance requirements in connection with enhanced Mobile BroadBand (eMBB), Ultra Reliable Low Latency Communications (URLLC), and massive Machine-Type Communications (mMTC), there has been ongoing standardization regarding beamforming and massive multi input multi output (MIMO) for mitigating radio-wave path loss and increasing radio-wave transmission distances in mmWave, supporting numerologies (for example, operating multiple subcarrier spacings) for efficiently utilizing mmWave resources and dynamic operation of slot formats, initial access technologies for supporting multi-beam transmission and broadbands, definition and operation of BandWidth Part (BWP), new channel coding methods such as a Low Density Parity Check (LDPC) code for large amount of data transmission and a polar code for highly reliable transmission of control information, L2 pre-processing, and network slicing for providing a dedicated network specialized to a specific service.
Currently, there are ongoing discussions regarding improvement and performance enhancement of initial 5G mobile communication technologies in view of services to be supported by 5G mobile communication technologies, and there has been physical layer standardization regarding technologies such as Vehicle-to-everything (V2X) for aiding driving determination by autonomous vehicles based on information regarding positions and states of vehicles transmitted by the vehicles and for enhancing user convenience, New Radio Unlicensed (NR-U) aimed at system operations conforming to various regulation-related requirements in unlicensed bands, new radio (NR) user equipment (UE) Power Saving, Non-Terrestrial Network (NTN) which is UE-satellite direct communication for providing coverage in an area in which communication with terrestrial networks is unavailable, and positioning.
Moreover, there has been ongoing standardization in air interface architecture/protocol regarding technologies such as Industrial Internet of Things (IIoT) for supporting new services through interworking and convergence with other industries, Integrated Access and Backhaul (IAB) for providing a node for network service area expansion by supporting a wireless backhaul link and an access link in an integrated manner, mobility enhancement including conditional handover and Dual Active Protocol Stack (DAPS) handover, and two-step random access for simplifying random access procedures (2-step random access channel (RACH) for NR). There also has been ongoing standardization in system architecture/service regarding a 5G baseline architecture (for example, service based architecture or service based interface) for combining Network Functions Virtualization (NFV) and Software-Defined Networking (SDN) technologies, and Mobile Edge Computing (MEC) for receiving services based on UE positions.
As 5G mobile communication systems are commercialized, connected devices that have been exponentially increasing will be connected to communication networks, and it is accordingly expected that enhanced functions and performances of 5G mobile communication systems and integrated operations of connected devices will be necessary. To this end, new research is scheduled in connection with eXtended Reality (XR) for efficiently supporting Augmented Reality (AR), Virtual Reality (VR), Mixed Reality (MR) and the like, 5G performance improvement and complexity reduction by utilizing Artificial Intelligence (AI) and Machine Learning (ML), AI service support, metaverse service support, and drone communication.
Furthermore, such development of 5G mobile communication systems will serve as a basis for developing not only new waveforms for providing coverage in terahertz bands of 6G mobile communication technologies, multi-antenna transmission technologies such as Full Dimensional MIMO (FD-MIMO), array antennas and large-scale antennas, metamaterial-based lenses and antennas for improving coverage of terahertz band signals, high-dimensional space multiplexing technology using Orbital Angular Momentum (OAM), and Reconfigurable Intelligent Surface (RIS), but also full-duplex technology for increasing frequency efficiency of 6G mobile communication technologies and improving system networks, AI-based communication technology for implementing system optimization by utilizing satellites and Artificial Intelligence (AI) from the design stage and internalizing end-to-end AI support functions, and next-generation distributed computing technology for implementing services at levels of complexity exceeding the limit of UE operation capability by utilizing ultra-high-performance communication and computing resources.
Immersive multimedia is becoming the next potential killer service not only for 5G Advanced, but also for 6G. Such immersive media services include XR services (including MR and augmented reality (AR)), as well as real time communication immersive services such as avatar related calls or use cases. Also included in this trend is the use of AI/ML with such immersive services, since AI/ML is almost an essential part of next generation multimedia use cases.
Multimedia services in SA4 currently support various media components such as video, audio (& speech) and timed text.
The main benefit and advantage of providing media services via 5G is the ability to guarantee the quality of service (QoS), since mobile network operators (MNOs) are able to provide for the specific data traffic needs of a defined service. As such, for the delivery of such components for various media services over 5G, the QoS of each of these components is clearly defined such that it can be used by the 5GS to provide an overall QoS (or several QoS levels, depending on the service configuration) for the service.
New immersive media services, however, require additional metadata which depends on the type of service; as such, immersive media contains multiple new types of metadata. Such metadata is typically non-timed for delivery purposes (e.g. it does not need to be delivered in a synchronized manner with video/audio media data) but requires synchronization with video/audio media data during presentation (i.e. when the media is consumed and displayed to the user during the service).
Such metadata may be for example:
- Pose information/data - metadata which describes the location and orientation of a device or user, for XR services
- Avatar information/data - more specifically avatar models which represent the face/head/body of a user's avatar, and skeleton information which represents the actual movement of the user at any given time, to be used to control the avatar
- AI/ML data - more specifically the information/data related to the AI model topology (or structure/architecture), and the data related to the weights or biases of the AI model, which may be updated separately during the service.
This disclosure contains mechanisms to describe the various types of immersive media metadata, such that the corresponding QoS can be assigned to the requirements of the metadata for the service.
Typically the type of immersive media service determines the type of immersive metadata required, and the delivery characteristics differ depending on the required immersive metadata. For example:
- Avatar metadata:
- User representation data (mesh, point cloud, etc) -> large, burst data
- User animation data (skeleton data) -> small, continuous data
- Pose information metadata:
- User location and pose information data (quaternion etc) -> small, continuous data
- AI model data:
- AI model topology/structure/architecture -> large, burst data
- AI model weight factors etc -> small, bursty/continuous data
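The delivery characteristics listed above can be summarised as a simple lookup table in code; the class and key names below are illustrative assumptions, and only the size/pattern pairings come from the list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MetadataProfile:
    size: str     # "large" or "small"
    pattern: str  # delivery pattern of the metadata stream

# Metadata type -> delivery characteristics, following the list above.
IM_PROFILES = {
    "avatar.representation": MetadataProfile("large", "burst"),       # mesh, point cloud, etc.
    "avatar.animation":      MetadataProfile("small", "continuous"),  # skeleton data
    "pose":                  MetadataProfile("small", "continuous"),  # quaternion etc.
    "ai.topology":           MetadataProfile("large", "burst"),       # model structure/architecture
    "ai.weights":            MetadataProfile("small", "bursty/continuous"),
}
```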
As such this disclosure describes a series of embodiments which allow such metadata to be expressed in a generic manner, through descriptions which can expose the data characteristics and QoS requirements of the metadata.
According to the disclosure, a method performed by a terminal in a wireless communication system is provided. The method comprises transmitting, to an immersive media server, a first message for creating an immersive media session, receiving, from the immersive media server, a second message including a description of immersive media output, establishing a transport connection for an immersive media service and receiving, from the immersive media server, a third message for an immersive metadata (IM) set, wherein the third message includes description information for the IM set, and wherein the description information includes a traffic type of the IM set and an IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
Also according to the disclosure, a method performed by an immersive media server in a wireless communication system is provided. The method comprises receiving, from a terminal, a first message for creating an immersive media session, transmitting, to the terminal, a second message including a description of immersive media output, establishing a transport connection for an immersive media service and transmitting, to the terminal, a third message for an immersive metadata (IM) set, wherein the third message includes description information for the IM set, and wherein the description information includes a traffic type of the IM set and an IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
Also according to the disclosure, a terminal in a wireless communication system is provided. The terminal comprises a transceiver and a controller configured to transmit, to an immersive media server, a first message for creating an immersive media session, to receive, from the immersive media server, a second message including a description of immersive media output, to establish a transport connection for an immersive media service, and to receive, from the immersive media server, a third message for an immersive metadata (IM) set, wherein the third message includes description information for the IM set, and wherein the description information includes a traffic type of the IM set and an IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
Also according to the disclosure, an immersive media server in a wireless communication system is provided. The immersive media server comprises a transceiver and a controller configured to receive, from a terminal, a first message for creating an immersive media session, to transmit, to the terminal, a second message including a description of immersive media output, to establish a transport connection for an immersive media service, and to transmit, to the terminal, a third message for an immersive metadata (IM) set, wherein the third message includes description information for the IM set, and wherein the description information includes a traffic type of the IM set and an IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
According to embodiments of the disclosure, method and apparatus of QoS analytics for immersive media metadata are provided.
The following is enabled according to the disclosure:
- Managing QoS and data traffic characteristics of immersive metadata
- The sending of small, continuous data type metadata through messages, whose message formats are defined in this disclosure
Figure 1 illustrates the overall 5G Media Streaming Architecture,
Figure 2 illustrates the multiple components of multimedia, typically video, audio and timed text, each of which has its own specified QoS,
Figure 3A illustrates architecture for immersive media between a UE and a trusted DN, which identifies the various functional entities and interfaces for enabling immersive media services in this disclosure,
Figure 3B illustrates a general high level call flow for immersive media,
Figure 4 illustrates the immersive media service session setup for the case where immersive media metadata is delivered in the downlink direction,
Figure 5 illustrates the immersive media service session setup for the case where immersive media metadata is delivered in the uplink direction, and
Figure 6 illustrates three different types of immersive metadata set traffic types.
Figure 7 illustrates an electronic device according to an embodiment of the present disclosure.
Figure 8 illustrates a node according to an embodiment of the present disclosure.
FIGS. 1 through 8, discussed below, and the various embodiments used to describe the principles of the present disclosure in this patent document are by way of illustration only and should not be construed in any way to limit the scope of the disclosure. Those skilled in the art will understand that the principles of the present disclosure may be implemented in any suitably arranged system or device.
Hereinafter, embodiments of the disclosure will be described in detail with reference to the accompanying drawings.
In describing the embodiments, descriptions related to technical contents well-known in the art and not associated directly with the disclosure will be omitted. Such an omission of unnecessary descriptions is intended to prevent obscuring of the main idea of the disclosure and more clearly transfer the main idea.
For the same reason, in the accompanying drawings, some elements may be exaggerated, omitted, or schematically illustrated. Further, the size of each element does not completely reflect the actual size. In the drawings, identical or corresponding elements are provided with identical reference numerals.
The advantages and features of the disclosure and ways to achieve them will be apparent by making reference to embodiments as described below in detail in conjunction with the accompanying drawings. However, the disclosure is not limited to the embodiments set forth below, but may be implemented in various different forms. The following embodiments are provided only to completely disclose the disclosure and inform those skilled in the art of the scope of the disclosure, and the disclosure is defined only by the scope of the appended claims. Throughout the specification, the same or like reference numerals designate the same or like elements. Furthermore, in describing the disclosure, a detailed description of known functions or configurations incorporated herein will be omitted when it is determined that the description may make the subject matter of the disclosure unnecessarily unclear. The terms which will be described below are terms defined in consideration of the functions in the disclosure, and may be different according to users, intentions of the users, or customs. Therefore, the definitions of the terms should be made based on the contents throughout the specification.
Herein, it will be understood that each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart block or blocks. These computer program instructions may also be stored in a computer usable or computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer usable or computer-readable memory produce an article of manufacture including instruction means that implement the function specified in the flowchart block or blocks. The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions that execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart block or blocks.
Furthermore, each block of the flowchart illustrations may represent a module, segment, or portion of code, which includes one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the blocks may occur out of order. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
As used in embodiments of the disclosure, the "unit" refers to a software element or a hardware element, such as a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC), which performs a predetermined function. However, the "unit" does not always have a meaning limited to software or hardware. The "unit" may be constructed either to be stored in an addressable storage medium or to be executed by one or more processors. Therefore, the "unit" includes, for example, software elements, object-oriented software elements, class elements or task elements, processes, functions, properties, procedures, sub-routines, segments of a program code, drivers, firmware, micro-codes, circuits, data, database, data structures, tables, arrays, and parameters. The elements and functions provided by the "unit" may be either combined into a smaller number of elements, or a "unit," or divided into a larger number of elements, or a "unit." Moreover, the elements and "units" may be implemented to reproduce one or more CPUs within a device or a security multimedia card. Furthermore, the "unit" in the embodiments may include one or more processors.
Hereinafter, a base station is an entity that allocates resources to a UE and may be at least one of a node B, a base station (BS), an eNode B (eNB), a gNode B (gNB), a radio access unit, a base station controller, and a node on a network. The UE may include a user equipment (UE), a mobile station (MS), a cellular phone, a smartphone, a computer, or a multimedia system capable of performing a communication function. Also, embodiments of the disclosure may be applied to other communication systems having a similar technical background or channel type to that of the embodiments of the disclosure. Also, the embodiments of the disclosure may also be applied to other communication systems through some modifications without significantly departing from the range of the disclosure based on determination of those skilled in the technical knowledge. For example, a 5th generation mobile communication technology (5G, new radio (NR)) developed after LTE-A may be included herein, and 5G may be a concept embracing the existing LTE and LTE-A and similar other services. Further, the disclosure may also be applied to other communication systems through some modifications without significantly departing from the range of the disclosure based on determination of those skilled in the technical knowledge.
Hereinafter, terms indicating an access node, terms indicating network entities or network functions (NFs), terms indicating messages, terms indicating an interface between network entities, terms indicating various pieces of identification information, and the like as used in the following description, are exemplified for convenience of explanation. Accordingly, the disclosure is not limited to the terms to be described later, but other terms indicating objects having equal technical meanings may be used.
Hereinafter, for convenience of explanation, some terms and names defined in 3rd generation partnership project (3GPP) long term evolution (LTE) standards and/or 3GPP new radio (NR) standards may be used. However, the disclosure is not limited to the above terms and names, and may also be applied to systems following other standards.
Figure 1 shows the overall 5G Media Streaming Architecture in TS 26.501, representing the specified 5GMS functions within the 5GS as defined in TS 23.501.
5GMS functions include UE, RAN, UPF, Trusted DN, External DN, 5GMS, 5G system, External, NEF, and PCF.
Figure 2 shows the multiple components of multimedia, typically video, audio and timed text, each of which has its own specified QoS. A grouping of these different immersive media types can be defined, and as such the required QoS for each of these media streams is requested and provisioned by the 5G core (5GC). In some services, a combination of these multiple media streams is given a combined QoS, such as when multiple media streams are multiplexed into a single PDU session, or bundled together similarly. This multiplexing of multiple media streams into a multiplexed stream in a transport connection is one of the QoS handling enhancements considered for network enhancements of the 5GC for extended reality and media (XRM). As part of this disclosure, a new generic media stream type for immersive metadata is defined, where the embodiments of this disclosure allow a similar QoS to be specified for a specific immersive metadata stream required for the immersive service, and thus requested and provisioned by the 5GC.
Figure 3A shows architecture for immersive media between a UE and a trusted DN, which identifies the various functional entities and interfaces for enabling immersive media services in this disclosure.
Immersive media AF: An Application Function similar to that defined in TS 23.501 clause 6.2.10, dedicated to immersive media. Typically provides various control functions to the Media Data Session Handler on the UE and/or to the Application Provider. It may interact with other 5GC network functions, such as a data collection proxy (DCP) function entity. The DCP may or may not include NWDAF function/functionality.
Immersive media AS: An application server (AS) dedicated to immersive media services. The immersive media AS typically supports immersive media hosting by ingesting immersive contents from an Application Provider, and egesting models to other network functions for media processing; it may also contain sub-functions which perform media processing for immersive media. An example of these immersive media processes may include: split/network rendering using pose information, avatar model processing or rendering using avatar metadata, and/or AI model inferencing in the network (either full inferencing or partial).
Application Provider: External application, with content-specific media functionality, and/or immersive media specific functionality (e.g. avatar creation, AI model creation, splitting, updating etc.).
In the UE:
Media Session Handler: a function on the UE that communicates with the immersive media AF in order to establish, control and support the delivery of an immersive media session, and may perform additional functions such as consumption and QoE metrics collection and reporting. The Session Handler may expose APIs that can be used by the immersive media Aware Application. It may contain logical subfunctions for immersive media, such as a metadata capability manager, which may include an AI Capability Manager that handles the negotiation and handling of metadata and capability related data and decisions internally in the UE, and also between the UE and the network.
Immersive media client: a function on the UE that communicates with the immersive media AS in order to download/stream (or even upload) the immersive media, and also the immersive media metadata, and may provide APIs to the immersive media Aware Application for metadata processing, and to the Media Session Handler for media/metadata session control in the UE.
In a simple embodiment of this disclosure, the description messages describing the mentioned immersive metadata are sent via the control plane between the AF and the media session handler (RTC-5), and the actual metadata is sent via the user plane between the immersive media server and the immersive media client (RTC-4).
Figure 3B shows a general high level call flow for immersive media.
Steps:
1. In this optional step, the Application Provider requests and sets up the edge server(s) used for the immersive media service where needed, in particular to process immersive metadata, as described in TS 26.506 clauses 6.1 or 6.2. The Application Provider may use any other method to allocate edge servers, or leave it to the MNO to set up appropriate edge servers to run the processing of immersive metadata.
2. The Application Provider provisions the immersive media service session using RTC-1 and RTC-5. If the edge servers were provisioned in step 1, the edge servers' ids are provided in this session to employ them for immersive media processing.
NOTE: In the case of client-driven edge management (TS 26.501 clause 8.1), only the client-driven immersive media service is applicable.
3. The immersive media service session is set up according to figures 4 and 5.
Figure 4 shows the immersive media service session setup for the case where immersive media metadata is delivered in the downlink direction.
The steps are:
1. The Metadata Manager (e.g. typically a Presentation Engine or similar) discovers the immersive media server and sets up a connection to it. It provides information about its immersive processing and rendering capabilities and the possible XR runtime configuration, e.g. the OpenXR configuration may be used for this purpose.
2. In response, the immersive media server creates a description of the immersive media output and the input it expects to receive from the UE.
3. The Metadata Manager requests the buffer streams from the MAF, which in turn establishes a connection to the immersive media server to deliver or stream the required immersive metadata for the immersive media service.
4. The Immersive Media Server transmits the immersive metadata to the Media Access Function.
5. The immersive metadata is passed to the Metadata Manager.
6. The immersive media data may also be transmitted to the Media Access Function in addition to the immersive metadata in step 4.
7. The immersive metadata received is then processed by the UE, between the Metadata Manager and the XR Source Manager.
8. The processed media data (using the metadata) or raw buffer frames are passed to the XR Runtime for rendering.
9. The immersive media is composed and rendered.
Figure 5 shows the immersive media service session setup for the case where immersive media metadata is delivered in the uplink direction.
The steps are:
1. The Metadata Manager (e.g. typically a Presentation Engine or similar) discovers the immersive media server and sets up a connection to it. It provides information about its immersive processing and rendering capabilities and the possible XR runtime configuration, e.g. the OpenXR configuration may be used for this purpose.
2. In response, the immersive media server creates a description of the immersive media output and the input it expects to receive from the UE.
3. The Metadata Manager requests the buffer streams from the MAF, which in turn establishes a connection to the immersive media server to deliver or stream the required immersive metadata for the immersive media service.
4. The Source Manager retrieves immersive metadata from the XR runtime.
5. The Source Manager shares the immersive metadata with the immersive media server.
6. The immersive media server uses the metadata to perform processing related to the immersive media.
7. The media data processed using the metadata in step 6 is sent to the Media Access Function.
8. The Media Access Function decodes and processes the buffer streams or processed media data.
9. The raw buffer frames or processed media data are sent to the Metadata Consumer and the XR runtime.
10. The Immersive Media is composed and rendered.
Figure 6 is an embodiment of this disclosure showing three different immersive metadata (IM) set traffic types.
An Immersive Metadata set is defined as a series of metadata units which are delivered together with the same timestamp, as a single set unit.
This disclosure in particular identifies the following immersive metadata:
- Avatar metadata:
-> User representation data (mesh, point cloud, etc) -> large, burst data
-> User animation data (skeleton data) -> small, continuous data
- Pose information metadata:
-> User location and pose information data (quaternion etc) -> small, continuous data
- AI model data:
-> AI model topology/structure/architecture -> large, burst data
-> AI model weight factors etc -> small, bursty/continuous data
IM set type continuous:
- This type is typically used for small, continuous data, where the frequency of the metadata is also typically high (such as user avatar animation data, or user location/pose information data).
IM set type burst interval:
- This type is typically used for metadata which contains both large burst data, followed by small, continuous data, in the same IM set (such as an IM set containing first user representation data, followed by multiple units of user animation data).
IM set type burst persistent:
- This type is typically used for metadata which contains only large burst data, with no other metadata contained for delivery in the IM set between bursts.
See IM Set Traffic Type [ISTT] under IM set related embodiment 1.
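The typical assignment of the immersive metadata kinds listed above to the three IM set traffic types can be sketched as follows; the numeric enum codes and the kind labels are illustrative assumptions of this sketch, not values defined in the disclosure.

```python
from enum import Enum

class IMSetTrafficType(Enum):
    """The three IM Set Traffic Type [ISTT] values shown in figure 6.
    The numeric codes here are illustrative assumptions, not normative."""
    CONTINUOUS = 1        # small units delivered at a high rate
    BURST_INTERVAL = 2    # a large burst followed by small continuous data
    BURST_PERSISTENT = 3  # large bursts only, nothing delivered in between

# Typical assignment of the metadata kinds identified in this disclosure
# to their traffic types (the kind names are hypothetical labels).
TYPICAL_ISTT = {
    "pose": IMSetTrafficType.CONTINUOUS,
    "avatar-animation": IMSetTrafficType.CONTINUOUS,
    "avatar-representation": IMSetTrafficType.BURST_PERSISTENT,
    "avatar-representation+animation": IMSetTrafficType.BURST_INTERVAL,
    "ai-model-topology": IMSetTrafficType.BURST_PERSISTENT,
    "ai-model-topology+weights": IMSetTrafficType.BURST_INTERVAL,
}
```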
[Immersive media metadata: IM set related embodiment 1]
In one embodiment of this disclosure, an IM set contains RTP (real-time transport protocol) packets of such non-timed immersive metadata, delivered via either WebRTC or MTSI/IMS.
IM set format:
- Each metadata type has its respective encapsulation format.
IM set transport protocols:
- Immersive media metadata shall use WebRTC for the real-time transport of the non-timed media metadata. The RTP restrictions for WebRTC as specified in RFC8834 shall apply. Metadata may be carried via SRTP, or alternatively via WebRTC data channel; the usage of the WebRTC data channel shall be in accordance with RFC8831.
IM set header extensions for IM set marking:
- For the determination and calculation of QoS related to the immersive media metadata, packet header extensions are defined, such as RTP header extensions if a transport protocol based on RTP is used. The syntax and semantics of these extensions are defined below:
- RTP Header Extension for Immersive Metadata Set Marking
Immersive metadata sets correspond to one or more PDUs containing immersive metadata.
- End PDU of the IM Set [E] (1 bit): This field is a flag that shall be set to 1 for the last PDU of the IM Set and set to 0 for all other PDUs of the IM Set.
- End of Data Burst [EDB] (3 bits): The EDB field is 3 bits in length and indicates the end of a Data Burst. The 3 bits encode the End of Data Burst indication as per the encoding and guidelines provided in Clause 4.4.2.6.1.
- IM Set Traffic Type [ISTT] (4 bits): The IM Set traffic type indicates the traffic characteristic of the immersive metadata in the IM set.
- IM Burst size [IBS] (24 bits): Indicates the maximum burst size of a metadata burst when ISTT is either burst persistent or burst interval.
- IM size [IS] (24 bits): Indicates the maximum size of a unit of immersive metadata when ISTT is either continuous or burst persistent.
- IM Set Sequence Number [ISSN] (10 bits): The field encodes the sequence number of the IM Set to which the current PDU belongs, acting as a 10-bit numerical identifier for the IM Set.
- PDU Sequence Number within an IM Set [PSN] (6 bits): The sequence number of the current PDU within the IM Set. The PSN shall be set to 0 for the first PDU in the IM Set and incremented monotonically for every PDU in the IM Set in order of transmission from the sender.
- NOTE: A receiver may use the RTP packet sequence number together with the PSN to distinguish between PDUs within an IM Set that contains more than 64 PDUs.
- IM Set Size [ISSize] (24 bits): The IM Set Size indicates the total size of all PDUs of the IM Set to which this PDU belongs. This field is optional and subject to an SDP signaling offer/answer negotiation, where the Application Server may indicate whether it will be able to provide the size of the IM Set for that RTP stream. If not enabled, the field should not be present. If enabled, but the Application Server is not able to determine the IM Set Size for a particular IM Set, it should set the value to 0 in all PDUs of that IM Set. The ISSize shall indicate the size of an IM Set including RTP/UDP/IP header encapsulation overhead of its corresponding PDUs. The ISSize is expressed in bytes.
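The field widths above sum to 96 bits (1 + 3 + 4 + 24 + 24 + 10 + 6 + 24), i.e. a fixed 12-byte extension payload. The sketch below packs and unpacks these marking fields; the field ordering and big-endian layout are assumptions of this sketch, not a normative wire format.

```python
# Field widths, in bits, in the order the fields are listed above
# (the ordering is an assumption of this sketch, not normative).
IM_SET_FIELDS = [("E", 1), ("EDB", 3), ("ISTT", 4), ("IBS", 24),
                 ("IS", 24), ("ISSN", 10), ("PSN", 6), ("ISSize", 24)]
TOTAL_BITS = sum(width for _, width in IM_SET_FIELDS)  # 96 bits = 12 bytes

def pack_im_set_marking(**fields) -> bytes:
    """Pack the IM Set marking fields into a 12-byte payload, MSB first.
    Missing fields default to 0; out-of-range values are rejected."""
    value = 0
    for name, width in IM_SET_FIELDS:
        v = fields.get(name, 0)
        if not 0 <= v < (1 << width):
            raise ValueError(f"{name}={v} does not fit in {width} bits")
        value = (value << width) | v
    return value.to_bytes(TOTAL_BITS // 8, "big")

def unpack_im_set_marking(payload: bytes) -> dict:
    """Inverse of pack_im_set_marking: recover the named fields."""
    value = int.from_bytes(payload, "big")
    out, shift = {}, TOTAL_BITS
    for name, width in IM_SET_FIELDS:
        shift -= width
        out[name] = (value >> shift) & ((1 << width) - 1)
    return out
```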
[Immersive media metadata: immersive metadata message containing actual metadata related embodiment 2]
In another embodiment of this disclosure, an immersive metadata message containing such non-timed immersive media metadata is delivered between the server and the UE via the WebRTC data channel. The usage of metadata data channel messages to carry actual metadata is typically limited to metadata with traffic characteristics of small, continuous data, i.e. of type continuous in figure 6.
IM message format:
- In the "3gpp-im" data channel sub-protocol, the IM message content depends on the type of the IM message. The data channel sub-protocol is defined separately.
- IM message types shall be unique identifiers in the URN format. This embodiment defines a set of IM types and their formats.
IM data channel message format:
- Immersive media shall use WebRTC for the real-time transport of the immersive metadata message. The data channel sub-protocol shall be identified as "3gpp-im-message", which shall be included in the dcmap attribute of the SDP. The transmission order for the data channel shall be set to in-order and the transmission reliability shall be set to reliable. The immersive metadata message format shall be set to text-based and the messages shall be UTF-8 encoded JSON messages.
- A data channel message may carry one or more immersive metadata messages as defined below.
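Since the format above requires UTF-8 encoded JSON messages over the data channel, the serialization can be sketched as follows. Carrying multiple IM messages in one JSON array is one possible realisation of "a data channel message may carry one or more immersive metadata messages"; the id/type/message field names follow the IM message data type, while the concrete payload content is a hypothetical placeholder.

```python
import json

def encode_im_datachannel_payload(im_messages: list) -> bytes:
    """Serialize one or more IM messages into a single "3gpp-im-message"
    data channel payload as UTF-8 encoded JSON."""
    return json.dumps(im_messages).encode("utf-8")

def decode_im_datachannel_payload(payload: bytes) -> list:
    """Recover the list of IM messages from a data channel payload."""
    return json.loads(payload.decode("utf-8"))

# An illustrative IM message skeleton; the payload of "message" is
# hypothetical and depends on the message type.
example = [{"id": "1", "type": "urn:3gpp:im:v1:pose", "message": {}}]
```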
IM message format:
Figure PCTKR2024017376-appb-img-000001
IM message data type:
Figure PCTKR2024017376-appb-img-000002
Message type payload formats (typically small, continuous data):
Pose:
The immersive media client on the XR device periodically transmits a set of pose predictions to the immersive media server. The type of the message shall be set to "urn:3gpp:im:v1:pose".
Each predicted pose shall contain the associated predicted display time and an identifier of the XR space that was used for that pose.
Depending on the view configuration of the XR session, there could be different pose information for each view.
The payload of the message shall be as follows (Pose Prediction Format):
Figure PCTKR2024017376-appb-img-000003
Figure PCTKR2024017376-appb-img-000004
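Since the normative pose prediction format is carried in the figures above, the sketch below only illustrates the general shape of such a message: a list of predicted poses, each with a predicted display time, an identifier of the XR space used, and per-view orientation/position. All field names and values here are hypothetical assumptions modelled on OpenXR conventions, not the normative format.

```python
import json

# Hypothetical "urn:3gpp:im:v1:pose" message; field names are assumptions.
pose_message = {
    "id": "1001",
    "type": "urn:3gpp:im:v1:pose",
    "message": {
        "poses": [
            {
                "displayTime": 1699999999033000,  # predicted display time (us)
                "xrSpace": "local-floor",         # XR space used for this pose
                "view": 0,                        # view index of the XR session
                "orientation": {"x": 0.0, "y": 0.0, "z": 0.0, "w": 1.0},
                "position": {"x": 0.0, "y": 1.6, "z": 0.0},
            }
        ]
    },
}

# On the wire the message is carried as UTF-8 encoded JSON.
wire = json.dumps(pose_message).encode("utf-8")
```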
Similar payload formats can be defined for types of "avatar" and "AI/ML" immersive metadata data channel messages.
Avatar: animation format (example)
Figure PCTKR2024017376-appb-img-000005
AI/ML: AI model weight update format (example)
Figure PCTKR2024017376-appb-img-000006
[Immersive media metadata: immersive metadata configuration message embodiment 3]
In another embodiment of this disclosure, an immersive metadata configuration message containing descriptions of such non-timed immersive media metadata is delivered between the server and the UE, either directly via the WebRTC data channel, or via the SWAP protocol using a SWAP server between the immersive media client and the immersive media server. The app-specific message on IM configuration may be parsed and connected with SDP offers and answers via related SDP parameters describing the same configuration. More specifically, such immersive metadata configuration messages describe the configuration related to the IM set as defined in embodiment 1.
IM set configuration message parameters:
- End PDU of the IM Set [E] (1 bit): This field is a flag that shall be set to 1 for the last PDU of the IM Set and set to 0 for all other PDUs of the IM Set.
- End of Data Burst [EDB] (3 bits): The EDB field is 3 bits in length and indicates the end of a Data Burst. The 3 bits encode the End of Data Burst indication as per the encoding and guidelines provided in Clause 4.4.2.6.1.
- IM Set Traffic Type [ISTT] (4 bits): The IM Set traffic type indicates the traffic characteristic of the immersive metadata in the IM set. The traffic type can be set to continuous, burst persistent, or burst interval.
- IM Burst size [IBS] (24 bits): Indicates the maximum burst size of a metadata burst when ISTT is either burst persistent or burst interval.
- IM size [IS] (24 bits): Indicates the maximum size of a unit of immersive metadata when ISTT is either continuous or burst persistent.
- IM Set Sequence Number [ISSN] (10 bits): The field encodes the sequence number of the IM Set to which the current PDU belongs, acting as a 10-bit numerical identifier for the IM Set.
- PDU Sequence Number within an IM Set [PSN] (6 bits): The sequence number of the current PDU within the IM Set. The PSN shall be set to 0 for the first PDU in the IM Set and incremented monotonically for every PDU in the IM Set in order of transmission from the sender.
- NOTE: A receiver may use the RTP packet sequence number together with the PSN to distinguish between PDUs within an IM Set that contains more than 64 PDUs.
- IM Set Size [ISSize] (24 bits): The IM Set Size indicates the total size of all PDUs of the IM Set to which this PDU belongs. This field is optional and subject to an SDP signaling offer/answer negotiation, where the Application Server may indicate whether it will be able to provide the size of the IM Set for that RTP stream. If not enabled, the field should not be present. If enabled, but the Application Server is not able to determine the IM Set Size for a particular IM Set, it should set the value to 0 in all PDUs of that IM Set. The ISSize shall indicate the size of an IM Set including RTP/UDP/IP header encapsulation overhead of its corresponding PDUs. The ISSize is expressed in bytes.
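Unlike embodiment 1, where these parameters travel in per-packet header extensions, the configuration message of this embodiment describes them once per stream so that the required QoS can be provisioned up front. A minimal sketch of such a configuration message and of how a bandwidth hint might be derived from it follows; the field names and the message type URN are hypothetical assumptions, and the bitrate rule is a sketch, not a normative mapping.

```python
# Hypothetical IM set configuration message; field names mirror the
# parameter abbreviations above but are assumptions of this sketch.
im_set_config = {
    "id": "2001",
    "type": "urn:3gpp:im:v1:config",  # hypothetical message type URN
    "message": {
        "istt": "continuous",  # continuous | burst-persistent | burst-interval
        "ibs": None,           # max burst size; only for the burst types
        "imSize": 192,         # max size of one metadata unit, in bytes
        "isSize": 4096,        # max total size of one IM set, in bytes
    },
}

def nominal_bitrate_bps(config: dict, sets_per_second: float) -> float:
    """A rough bandwidth hint the network could derive from the
    configuration when assigning QoS for the metadata stream."""
    return config["message"]["isSize"] * 8 * sets_per_second
```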
[Embodiment describing process of sending pose information immersive metadata]
This section describes the process of sending immersive media metadata which is pose information, for services requiring such metadata e.g. XR, AR, MR immersive media services.
1. Based on the architecture shown in figure 3A, the high level call flow shown in figure 3B is commenced. Step 3 in figure 3B for session setup is then progressed.
2. Since pose information is typically sent in the uplink direction, step 3 in figure 3B is progressed as shown in the procedure defined by figure 5.
3. For the configuration of such pose information, the immersive metadata configuration message as described in embodiment 3 is exchanged between the UE and the server in step 3 of figure 5. The parameters inside this configuration message enable the required QoS to be assigned for the metadata to be transported. For pose information, the IM Set Traffic Type [ISTT] is typically set to continuous.
4. Metadata (pose information) is delivered from the UE to the server in step 5 of figure 5 in two different possible manners:
a. As RTP or SCTP packets, via WebRTC or MTSI, with the header extensions as specified in embodiment 1.
b. As immersive metadata messages which contain the actual metadata, with the message format type for pose information, as specified in embodiment 2.
[Embodiment describing flow of sending avatar immersive metadata]
This section describes the process of sending immersive media metadata which is avatar immersive metadata, for services requiring such metadata e.g. avatar calls, or games with avatars.
1. Based on the architecture shown in figure 3A, the high level call flow shown in figure 3B is commenced. Step 3 in figure 3B for session setup is then progressed.
2. Since avatar immersive metadata can typically be sent in both the uplink and downlink directions, step 3 in figure 3B is progressed as shown in the procedure defined by either figure 4 or figure 5.
3. For the configuration of such avatar metadata information, the immersive metadata configuration message as described in embodiment 3 is exchanged between the UE and the server in step 3 of figure 4 or figure 5. The parameters inside this configuration message enable the required QoS to be assigned for the metadata to be transported. For avatar information, the IM Set Traffic Type [ISTT] is typically set to burst persistent when both the avatar model and the skeleton are sent, or to continuous when only the skeleton information is sent.
4. Metadata (avatar information) is delivered from the server to the UE in step 4 of figure 4, or from the UE to the server in step 5 of figure 5, in two different possible manners:
a. As RTP or SCTP packets, via WebRTC or MTSI, with the header extensions as specified in embodiment 1.
b. As immersive metadata messages which contain the actual metadata, with the message format type for avatar information, as specified in embodiment 2.
[Embodiment describing flow of sending AI/ML immersive metadata]
This section describes the process of sending immersive media metadata which is AI/ML immersive metadata, for services requiring such metadata e.g. AI/ML processing of media for upscaling, vision applications etc.
1. Based on the architecture shown in figure 3A, the high level call flow shown in figure 3B is commenced. Step 3 in figure 3B for session setup is then progressed.
2. Since AI/ML immersive metadata can typically be sent in both the uplink and downlink directions, step 3 in figure 3B is progressed as shown in the procedure defined by either figure 4 or figure 5.
3. For the configuration of such AI/ML metadata information, the immersive metadata configuration message as described in embodiment 3 is exchanged between the UE and the server in step 3 of figure 4 or figure 5. The parameters inside this configuration message enable the required QoS to be assigned for the metadata to be transported. For AI/ML information, the IM Set Traffic Type [ISTT] is typically set to burst persistent when both the AI model topology/architecture and updates of biases and weights are constantly sent, or to burst interval when only the AI model topology is sent periodically (i.e. updated).
4. Metadata (AI/ML information) is delivered from the server to the UE in step 4 of figure 4, or from the UE to the server in step 5 of figure 5, in two different possible manners:
a. As RTP or SCTP packets, via WebRTC or MTSI, with the header extensions as specified in embodiment 1.
b. As immersive metadata messages which contain the actual metadata, with the message format type for AI/ML information, as specified in embodiment 2.
FIG. 7 illustrates an electronic device according to an embodiment of the present disclosure.
Referring to FIG. 7, the electronic device 700 may include a processor 710, a transceiver 720 and a memory 730. However, not all of the illustrated components are essential. The electronic device 700 may be implemented by more or fewer components than those illustrated in FIG. 7. In addition, the processor 710, the transceiver 720 and the memory 730 may be implemented as a single chip according to another embodiment.
The electronic device 700 may correspond to the UE described above.
The aforementioned components will now be described in detail.
The processor 710 may include one or more processors or other processing devices that control the provided function, process, and/or method. Operation of the electronic device 700 may be implemented by the processor 710.
The transceiver 720 may include an RF transmitter for up-converting and amplifying a transmitted signal, and an RF receiver for down-converting a frequency of a received signal. However, according to another embodiment, the transceiver 720 may be implemented by more or fewer components than those described above.
The transceiver 720 may be connected to the processor 710 and transmit and/or receive a signal. The signal may include control information and data. In addition, the transceiver 720 may receive the signal through a wireless channel and output the signal to the processor 710. The transceiver 720 may transmit a signal output from the processor 710 through the wireless channel.
The memory 730 may store the control information or the data included in a signal obtained by the electronic device 700. The memory 730 may be connected to the processor 710 and store at least one instruction or a protocol or a parameter for the provided function, process, and/or method. The memory 730 may include read-only memory (ROM) and/or random access memory (RAM) and/or hard disk and/or CD-ROM and/or DVD and/or other storage devices.
FIG. 8 illustrates a node according to an embodiment of the present disclosure.
Referring to FIG. 8, the node 800 may include a processor 810, a transceiver 820 and a memory 830. However, not all of the illustrated components are essential. The node 800 may be implemented by more or fewer components than those illustrated in FIG. 8. In addition, the processor 810, the transceiver 820 and the memory 830 may be implemented as a single chip according to another embodiment.
The node 800 may correspond to the base station, the server and/or the immersive media server described above.
The aforementioned components will now be described in detail.
The processor 810 may include one or more processors or other processing devices that control the provided function, process, and/or method. Operation of the node 800 may be implemented by the processor 810.
The transceiver 820 may include an RF transmitter for up-converting and amplifying a transmitted signal, and an RF receiver for down-converting a frequency of a received signal. However, according to another embodiment, the transceiver 820 may be implemented by more or fewer components than those described above.
The transceiver 820 may be connected to the processor 810 and transmit and/or receive a signal. The signal may include control information and data. In addition, the transceiver 820 may receive the signal through a wireless channel and output the signal to the processor 810. The transceiver 820 may transmit a signal output from the processor 810 through the wireless channel.
The memory 830 may store the control information or the data included in a signal obtained by the node 800. The memory 830 may be connected to the processor 810 and store at least one instruction or a protocol or a parameter for the provided function, process, and/or method. The memory 830 may include read-only memory (ROM) and/or random access memory (RAM) and/or hard disk and/or CD-ROM and/or DVD and/or other storage devices.
The methods according to the embodiments of the disclosure described in the specification or the claims may be implemented by hardware, software, or a combination thereof.
In a case where the methods are implemented by software, a computer-readable storage medium may be provided to store one or more programs (software modules). The one or more programs stored in the computer-readable storage medium may be configured for execution by one or more processors in an electronic device. The one or more programs may include instructions for causing the electronic device to execute the methods according to the embodiments of the disclosure described in the specification or the claims.
These programs (software modules or software) may be stored in random access memories, nonvolatile memories including flash memories, read only memories (ROMs), electrically erasable programmable ROMs (EEPROMs), magnetic disc storage devices, compact disc-ROMs (CD-ROMs), digital versatile discs (DVDs), other types of optical storage devices, or magnetic cassettes. Also, the programs may be stored in a memory constituted by a combination of some or all of such storage devices. Also, each of the constituent memories may be provided in plurality.
Also, the programs may be stored in an attachable storage device that may be accessed through a communication network such as Internet, Intranet, local area network (LAN), wide LAN (WLAN), or storage area network (SAN), or through a communication network constituted by any combination thereof. Such a storage device may be connected through an external port to an apparatus performing an embodiment of the disclosure. Also, a separate storage device on a communication network may be connected to an apparatus performing an embodiment of the disclosure.
In the above particular embodiments of the disclosure, the components included in the disclosure are expressed in the singular or plural according to the presented particular embodiments. However, the singular or plural expressions are selected suitably according to the presented situations for convenience of description, the disclosure is not limited to the singular or plural components, and the components expressed in the plural may even be constituted in the singular or the components expressed in the singular may even be constituted in the plural.
Meanwhile, the embodiments of the disclosure disclosed in this specification and drawings are only specific examples to easily explain the technical content of the disclosure and help understand the disclosure, and are not intended to limit the scope of the disclosure. That is, it is obvious to a person skilled in the art to which the disclosure pertains that other modified examples based on the technical idea of the disclosure are possible. In addition, each of the above embodiments can be combined and operated with each other as needed. For example, parts of one embodiment of the disclosure and another embodiment can be combined with each other to operate a base station and UE. In addition, the embodiments of the disclosure can be applied to other communication systems, and other modified examples based on the technical idea of the embodiments can also be implemented.
Although the present disclosure has been described with various embodiments, various changes and modifications may be suggested to one skilled in the art. It is intended that the present disclosure encompass such changes and modifications as fall within the scope of the appended claims.

Claims (15)

  1. A method performed by a terminal in a wireless communication system, the method comprising:
    transmitting, to an immersive media server, a first message for creating an immersive media session;
    receiving, from the immersive media server, a second message including description of immersive media output;
    establishing a transport connection for an immersive media service; and
    receiving, from the immersive media server, a third message for an immersive metadata (IM) set,
    wherein the third message includes description information for the IM set, and
    wherein the description information includes a traffic type of the IM set and IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
  2. The method of claim 1,
    wherein the description information is included in a real-time transport protocol (RTP) header and the IM set corresponding to the description information is included in RTP packets.
  3. The method of claim 1, further comprising:
    receiving, from the immersive media server, fourth messages including the IM set based on description information included in the third message,
    wherein each of the fourth messages is formatted based on an immersive metadata message data type,
    wherein the immersive metadata message data type is defined based on an id field, a type field and a message field, and
    wherein the message field includes message content depending on a type of message identified based on the type field.
  4. The method of claim 1,
    wherein the description information further includes at least one of End protocol data unit (PDU) of the IM set, End of data burst, IM burst size, IM set sequence number, PDU sequence number within the IM set, or IM set size.
  5. A method performed by an immersive media server in a wireless communication system, the method comprising:
    receiving, from a terminal, a first message for creating an immersive media session;
    transmitting, to the terminal, a second message including description of immersive media output;
    establishing a transport connection for an immersive media service; and
    transmitting, to the terminal, a third message for an immersive metadata (IM) set,
    wherein the third message includes description information for the IM set, and
    wherein the description information includes a traffic type of the IM set and IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
  6. The method of claim 5,
    wherein the description information is included in a real-time transport protocol (RTP) header and the IM set corresponding to the description information is included in RTP packets.
  7. The method of claim 5, further comprising:
    transmitting, to the terminal, fourth messages including the IM set based on description information included in the third message,
    wherein each of the fourth messages is formatted based on an immersive metadata message data type,
    wherein the immersive metadata message data type is defined based on an id field, a type field and a message field, and
    wherein the message field includes message content depending on a type of message identified based on the type field.
  8. The method of claim 5,
    wherein the description information further includes at least one of End protocol data unit (PDU) of the IM set, End of data burst, IM burst size, IM set sequence number, PDU sequence number within the IM set, or IM set size.
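The immersive metadata message data type recited in claims 3 and 7 carries three fields: an id, a type, and a message whose content depends on the type. The sketch below shows one possible encoding; the JSON wire format, the `"pose"` type name, and the payload keys are assumptions made for illustration only.

```python
import json

def encode_im_message(msg_id: int, msg_type: str, content: dict) -> bytes:
    """Serialize one immersive metadata message (id, type, message)."""
    return json.dumps({"id": msg_id, "type": msg_type,
                       "message": content}).encode("utf-8")

def decode_im_message(payload: bytes) -> dict:
    """Deserialize and interpret the message field according to the type field."""
    msg = json.loads(payload.decode("utf-8"))
    # The message field's content depends on the type of message
    # identified by the type field.
    if msg["type"] == "pose":
        assert "orientation" in msg["message"]
    return msg

raw = encode_im_message(1, "pose", {"orientation": [0.0, 0.0, 0.0, 1.0]})
decoded = decode_im_message(raw)
```

The point of the type-switch in the decoder is that a receiver need not know every message layout in advance: it dispatches on the type field and parses the message field accordingly.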
  9. A terminal in a wireless communication system, the terminal comprising:
    a transceiver; and
    a controller configured to:
    transmit, to an immersive media server, a first message for creating an immersive media session,
    receive, from the immersive media server, a second message including description of immersive media output,
    establish a transport connection for an immersive media service, and
    receive, from the immersive media server, a third message for an immersive metadata (IM) set,
    wherein the third message includes description information for the IM set, and
    wherein the description information includes a traffic type of the IM set and IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
  10. The terminal of claim 9,
    wherein the description information is included in a real-time transport protocol (RTP) header and the IM set corresponding to the description information is included in RTP packets.
  11. The terminal of claim 9,
    wherein the controller is configured to receive, from the immersive media server, fourth messages including the IM set based on the description information included in the third message,
    wherein each of the fourth messages is formatted based on an immersive metadata message data type,
    wherein the immersive metadata message data type is defined based on an id field, a type field, and a message field, and
    wherein the message field includes message content depending on a type of message identified based on the type field.
  12. The terminal of claim 9,
    wherein the description information further includes at least one of End protocol data unit (PDU) of the IM set, End of data burst, IM burst size, IM set sequence number, PDU sequence number within the IM set, or IM set size.
  13. An immersive media server in a wireless communication system, the immersive media server comprising:
    a transceiver; and
    a controller configured to:
    receive, from a terminal, a first message for creating an immersive media session,
    transmit, to the terminal, a second message including description of immersive media output,
    establish a transport connection for an immersive media service, and
    transmit, to the terminal, a third message for an immersive metadata (IM) set,
    wherein the third message include description information for the IM set, and
    wherein the description information includes a traffic type of the IM set and IM size, the traffic type indicating a traffic characteristic of the IM set and the IM size indicating a maximum size of a unit of an IM.
  14. The immersive media server of claim 13,
    wherein the description information is included in a real-time transport protocol (RTP) header and the IM set corresponding to the description information is included in RTP packets.
  15. The immersive media server of claim 13,
    wherein the controller is configured to transmit, to the terminal, fourth messages including the IM set based on the description information included in the third message,
    wherein each of the fourth messages is formatted based on an immersive metadata message data type,
    wherein the immersive metadata message data type is defined based on an id field, a type field, and a message field,
    wherein the message field includes message content depending on a type of message identified based on the type field, and
    wherein the description information further includes at least one of End protocol data unit (PDU) of the IM set, End of data burst, IM burst size, IM set sequence number, PDU sequence number within the IM set, or IM set size.
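Across all independent claims, the same four-step exchange recurs: a first message creating the session, a second message describing the immersive media output, a third message announcing an IM set via its description information, and fourth messages delivering the IM set itself. The stub below walks through that sequence in memory; the server class, payload contents, and the count of three IM units are all illustrative assumptions.

```python
class ImmersiveMediaServer:
    """Minimal stub of the claimed server-side message flow."""

    def create_session(self, first_msg: dict) -> dict:
        # Second message: description of immersive media output.
        return {"output_description": {"codec": "V3C", "formats": ["mesh"]}}

    def announce_im_set(self) -> dict:
        # Third message: description information for the coming IM set,
        # including its traffic type and the maximum size of one IM unit.
        return {"traffic_type": "burst", "im_size": 1200}

    def send_im_set(self, description: dict) -> list:
        # Fourth messages: the IM set, each unit bounded by im_size and
        # formatted with the id/type/message fields.
        unit = b"\x00" * description["im_size"]
        return [{"id": i, "type": "im_unit", "message": unit}
                for i in range(3)]

server = ImmersiveMediaServer()
session = server.create_session({"action": "create"})   # first message
desc = server.announce_im_set()                         # third message
im_set = server.send_im_set(desc)                       # fourth messages
```

Announcing the description information before the IM set itself is what lets the receiving terminal (or the network, when the information rides in an RTP header per claims 2, 6, 10, and 14) prepare QoS handling for the burst before its PDUs arrive.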
PCT/KR2024/017376 2023-11-06 2024-11-06 Method and apparatus of qos analytics for immersive media metadata Pending WO2025100925A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2023-0151981 2023-11-06
KR20230151981 2023-11-06

Publications (1)

Publication Number Publication Date
WO2025100925A1 2025-05-15

Family

ID=95695607

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2024/017376 Pending WO2025100925A1 (en) 2023-11-06 2024-11-06 Method and apparatus of qos analytics for immersive media metadata

Country Status (1)

Country Link
WO (1) WO2025100925A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20190005155A (en) * 2016-04-08 2019-01-15 아이디에이씨 홀딩스, 인크. Physical layer multiplexing of different types of traffic in a 5G system
US20210144581A1 (en) * 2019-11-12 2021-05-13 Samsung Electronics Co., Ltd. Flexible high capacity-radio network temporary identifier
KR20220011688A (en) * 2019-05-20 2022-01-28 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Immersive media content presentation and interactive 360° video communication
KR20230116706A (en) * 2022-01-28 2023-08-04 주식회사 케이티 Method and apparatus for controlling traffic

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SRINIVAS GUDUMASU (InterDigital Communications): "[5G_RTP] Signaling PDU set importance in RTP HE", 3GPP Draft S4-230715 (discussion), 3rd Generation Partnership Project (3GPP), SA WG4, online meeting, 17-21 April 2023, published 23 April 2023, XP052295542 *

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24889112

Country of ref document: EP

Kind code of ref document: A1