MXPA98007713A - Generation of a bit stream containing binary image/audio data multiplexed with a code defining an object in ASCII format - Google Patents
- Publication number
- MXPA98007713A (also published as MXPA/A/1998/007713A, MX9807713A)
- Authority
- MX
- Mexico
- Prior art keywords
- information
- image
- data
- audio data
- flow
- Prior art date
Abstract
The present invention relates to a controller system that directs a storage device to transfer scene description data corresponding to a user request signal. An analysis circuit extracts a URL (Uniform Resource Locator) included in the scene description data and causes the storage device to output the elementary stream and the object stream corresponding to the URL. After an object descriptor is extracted from the object stream, a generator generates an ID for this object descriptor and supplies it to an encoder. In addition, the generator adds the ID to the object descriptor and sends the ID-tagged object descriptor to a multiplexer. The multiplexer multiplexes the scene description data including the ID, which has been converted into binary format, the object descriptor, and the elementary streams into a multiplexed stream for output.
Description
GENERATION OF A BIT STREAM CONTAINING BINARY IMAGE/AUDIO DATA MULTIPLEXED WITH A CODE DEFINING AN OBJECT IN ASCII FORMAT
BACKGROUND OF THE INVENTION
The present invention relates to an encoding and decoding apparatus and method for recording a moving image signal on a recording medium such as an optical disc or a magnetic tape and reproducing it for display on a display device. The present invention can be used in video conference systems, video telephone systems, broadcasting equipment, multimedia database retrieval systems and the like, in which a moving image signal is transmitted from a transmission side to a receiving side through a transmission line and is received and displayed on the receiving side. The present invention can also be used to edit and record a moving image signal.
In a video conference system or a video telephone system in which a moving picture signal is transmitted to a remote location, the image signal is compressed/encoded using line correlation or inter-frame correlation of the video signal in order to use the transmission line efficiently. In recent years, with the improvement in computer processing power, moving-image information terminals that use a computer have come into increasingly wide use. In these systems, information is transmitted to remote locations through a transmission line such as a network. In this case, to use the transmission line efficiently, the signal to be transmitted, whether image, sound, or computer data, is compressed/encoded. On the terminal (receiving) side, the compressed/encoded signal that has been transmitted is decoded, by a predetermined decoding method corresponding to the encoding method, into the original image, sound, or computer data, which the terminal outputs through a display device, speakers, or the like. Previously, the transmitted image signal or the like was simply transferred, as such, to a presentation device.
But in information terminals that use a computer, a plurality of images, sounds or computer data can be handled, or displayed in a two-dimensional or three-dimensional space after being subjected to a certain conversion process. This type of process can be performed in such a way that the information of a two-dimensional or three-dimensional space is described by a prescribed method on the transmission side, and the terminal (receiving) side executes a conversion process on an image signal or the like according to the description. A common example of a scheme for describing spatial information is VRML (Virtual Reality Modeling Language), which has been standardized by ISO-IEC/JTC1/SC24; the latest version, VRML 2.0, is described in ISO/IEC 14772. VRML is a language for describing a three-dimensional space and defines data that describe the attributes, shapes, etc., of a three-dimensional space. Such data are known as nodes. To describe a three-dimensional space, it is necessary to describe in advance how the nodes are combined. Each node includes data indicating color, texture, etc., data indicating polygon shapes, and other information.
In information terminals that use a computer, a given object is rendered by CG (computer graphics) using polygons, etc., according to the VRML description mentioned above. With VRML it is possible to attach a texture to a three-dimensional object that has been generated in this way from polygons. A node known as "texture" is defined for still images and a node known as "moving texture" is defined for moving images; in these nodes, information on the texture to be attached (a file name, the start and stop times of the presentation, etc.) is described.
With reference to Figure 23, a texture-attaching process (hereinafter referred to as the texture mapping process, where appropriate) will be described. Figure 23 shows an example of the configuration of an apparatus for texture mapping. As shown in Fig.
23, a group of memories 200 includes a texture memory 200a, a gray-scale memory 200b, and a three-dimensional object memory 200c. The texture memory 200a stores texture information that is input externally. The gray-scale memory 200b and the three-dimensional object memory 200c store, respectively, key data indicating the degree of penetration (transparency) of the texture and the information of the three-dimensional object, which are also input externally. The information of the three-dimensional object is necessary for the generation of polygons and is related to the illumination. A presentation circuit 201 generates a three-dimensional object by generating polygons on the basis of the three-dimensional object information stored in the three-dimensional object memory 200c of the group of memories 200. Furthermore, based on the data of the three-dimensional object, the presentation circuit 201 reads the texture information and the key data indicating the degree of penetration (transparency) of the texture from the memories 200a and 200b, respectively, and executes a process of superimposing the texture on a corresponding background image in accordance with the key data. The key data indicate the degree of penetration of the texture at a corresponding position, i.e., the transparency of the object at that position. A two-dimensional conversion circuit 202 outputs a two-dimensional image signal obtained by mapping the three-dimensional object generated by the presentation circuit 201 onto a two-dimensional plane, based on viewpoint information provided from the outside. Where the texture is a moving image, the above process is executed frame by frame.
With VRML, it is possible to handle, as texture information, data that have been compressed according to JPEG (Joint Photographic Experts Group), which is commonly used for high-efficiency coding of still images, or MPEG (Moving Picture Experts Group), which encodes moving images with high efficiency, or the like. Where an image so compressed is used as a texture, the texture (image) is decoded by a decoding process corresponding to its coding scheme. The decoded image is stored in the texture memory 200a of the group of memories 200 and is subjected to a process similar to the one above.
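The node structure described above (a texture node carrying a file name and the start/stop times of its presentation, combined with shape and color data) can be sketched as a small data model. The class and field names below are illustrative stand-ins chosen to echo the VRML 2.0 MovieTexture fields, not part of the apparatus described in this document:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class MovieTextureNode:
    # Information described in the node for a moving-image texture:
    # the file (URL) to be mapped and the presentation start/stop times.
    url: List[str] = field(default_factory=list)
    start_time: float = 0.0
    stop_time: float = 0.0

@dataclass
class ShapeNode:
    # A node combines data indicating color, polygon shapes, etc.,
    # with a texture node to be attached to the object.
    color: tuple = (1.0, 1.0, 1.0)
    texture: Optional[MovieTextureNode] = None

# A scene is described by stating in advance how the nodes are combined.
scene = ShapeNode(texture=MovieTextureNode(url=["movie.mpg"], stop_time=10.0))
```

A renderer walking such a scene graph would decode the movie named by `url` and map each frame onto the shape, as the text describes.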
The presentation circuit 201 attaches the texture information stored in the texture memory 200a to an object at a certain position, regardless of the format of the image and of whether the image is a moving image or a still image. Therefore, the texture that can be attached to a given polygon is limited to what is stored in that one memory. During the transmission of three-dimensional object information, it is necessary to transmit the three-dimensional coordinates of each vertex; a 32-bit real number is necessary for each coordinate component. Real data of 32 bits or more are also required for attributes such as the reflectance of each three-dimensional object. The information to be transmitted is therefore enormous, and it increases further during the transmission of a complex three-dimensional object or a moving image. Consequently, when transmitting three-dimensional information such as the above, or texture information, through a transmission line, it is necessary to transmit compressed information to improve the efficiency of the transmission.
A common example of a high-efficiency coding (compression) scheme for a moving image is the MPEG scheme (Moving Picture Experts Group; coding of moving images for storage), which was discussed in ISO-IEC/JTC1/SC2/WG11 and proposed as a standard. MPEG employs a hybrid scheme that combines motion-compensated predictive coding and DCT (discrete cosine transform) coding. To accommodate various applications and functions, MPEG defines various profiles (classifications of functions) and levels (quantities such as image size). The most fundamental is the main level of the main profile (MP@ML).
An example of the configuration of an MP@ML encoder (image encoding apparatus) of the MPEG scheme will be described with reference to Fig. 24. An input image signal is first input to a frame memory 1 and is then encoded in a predetermined order. The image data to be encoded are input to a motion vector detection circuit (ME) 2 in macroblocks. The motion vector detection circuit 2 processes the image data of each frame as an I picture, a P picture, or a B picture according to a predetermined sequence. That is, it is predetermined whether the images of the respective frames input in sequence are processed as I, P, or B pictures (for example, they are processed in the order I, B, P, B, P, ..., B, P). The motion vector detection circuit 2 performs motion compensation with respect to a predetermined reference frame and detects its motion vector. Motion compensation (inter-frame prediction) has three prediction modes, i.e., forward prediction, backward prediction, and bidirectional prediction. Only forward prediction is available as a prediction mode for P pictures; all three prediction modes, i.e., forward, backward, and bidirectional prediction, are available for B pictures.
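The predetermined picture-type sequence applied by the motion vector detection circuit 2 can be sketched as a fixed assignment rule. The helper below is an illustration only; it follows the example order I, B, P, B, P, ..., B, P given in the text, not any particular GOP structure mandated by the apparatus:

```python
def picture_type(frame_index):
    """Assign each input frame its predetermined picture type,
    following the example order I, B, P, B, P, ... in the text."""
    if frame_index == 0:
        return "I"          # the sequence starts with an intra picture
    # After the leading I picture, B and P pictures alternate.
    return "B" if frame_index % 2 == 1 else "P"

sequence = [picture_type(i) for i in range(7)]
```

Only the first frame is intra-coded here; every P picture may be predicted forward from an earlier reference, and every B picture from references on either side.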
The motion vector detection circuit 2 selects a prediction mode that minimizes the prediction error and generates a corresponding motion vector. The resulting prediction error is compared with, for example, the variance of the macroblock to be encoded. If the variance of the macroblock is smaller than the prediction error, no prediction is made for that macroblock and intra-frame coding is performed; in this case, the prediction mode is intra-picture (intra) prediction. The motion vector detected by the motion vector detection circuit 2 and the aforementioned prediction mode are input to a variable-length encoder circuit 6 and a motion compensation circuit (MC) 12. The motion compensation circuit 12 generates prediction image data based on a given motion vector and inputs them to operation circuits 3 and 10. The operation circuit 3 calculates difference data indicating the difference between the value of the macroblock to be encoded and the value of the prediction image data, and sends the result to a DCT circuit 4. In the case of an intra-macroblock mode, the operation circuit 3 sends the data of the macroblock to be encoded, as such, to the DCT circuit 4. The DCT circuit 4 converts the input data into DCT coefficients by subjecting them to the DCT (discrete cosine transform). The DCT coefficients are input to a quantization circuit (Q) 5, where they are quantized with a quantization step corresponding to the amount of data stored (the buffer occupancy) in a transmission buffer 7. The quantized coefficients (data) are input to the variable-length encoder circuit 6, which converts the quantized data supplied from the quantization circuit 5 into a variable-length code such as a Huffman code.
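The intra/inter decision described above — compare the best available prediction error with the macroblock's own variance, and code intra when the variance is smaller — can be sketched as follows. The sum-of-squares statistics and the dictionary of candidate predictions are simplified illustrations, not the circuit's exact metric:

```python
def choose_mode(macroblock, predictions):
    """Pick intra coding when the macroblock's variance is smaller
    than the best prediction error; otherwise pick the prediction
    mode (forward/backward/bidirectional) that minimizes the error."""
    mean = sum(macroblock) / len(macroblock)
    variance = sum((x - mean) ** 2 for x in macroblock)

    best_mode, best_error = None, float("inf")
    for mode, pred in predictions.items():
        error = sum((x - p) ** 2 for x, p in zip(macroblock, pred))
        if error < best_error:
            best_mode, best_error = mode, error

    return "intra" if variance < best_error else best_mode

# A macroblock well predicted by the forward reference:
mode = choose_mode([10, 12, 11, 13],
                   {"forward": [10, 12, 11, 12], "backward": [0, 0, 0, 0]})
```

For a P picture only the `"forward"` entry would be offered; for a B picture all three modes would compete.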
The variable-length encoder circuit 6 also receives the quantization step (scale) from the quantization circuit 5, as well as the prediction mode (indicating whether intra-picture, forward, backward, or bidirectional prediction was used) and the motion vector from the motion vector detection circuit 2, and performs variable-length coding on them. The transmission buffer 7 temporarily stores the received encoded data and sends a quantization control signal, corresponding to its amount of stored data, to the quantization circuit 5. When the residual amount of data has increased to the permissible upper limit, the transmission buffer 7 reduces the amount of quantized data by increasing the quantization scale of the quantization circuit 5 by means of the quantization control signal. Conversely, when the residual amount of data has decreased to the permissible lower limit, the transmission buffer 7 increases the amount of quantized data by decreasing the quantization scale of the quantization circuit 5 by means of the quantization control signal. In this way, overflow and underflow of the transmission buffer 7 are prevented. The encoded data stored in the transmission buffer 7 are read out with predetermined timing and output as a bit stream to a transmission line. On the other hand, the quantized data output from the quantization circuit 5 are input to a dequantization circuit (IQ) 8, where they are dequantized in accordance with the quantization step provided from the quantization circuit 5. The output data (DCT coefficients) of the dequantization circuit 8 are input to an IDCT (inverse DCT) circuit 9, subjected to inverse DCT processing, and stored in a frame memory (FM) 11 through the operation circuit 10.
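The feedback loop between the transmission buffer 7 and the quantization circuit 5 can be sketched as a simple control rule. The fill thresholds and the step increment below are illustrative values, not figures from the document:

```python
def adjust_quantization_step(step, buffer_fill, upper=0.9, lower=0.1):
    """Quantization control signal of the transmission buffer:
    coarsen quantization near overflow (less data produced) and
    refine it near underflow (more data produced)."""
    if buffer_fill >= upper:
        return step + 1              # larger quantization scale
    if buffer_fill <= lower:
        return max(1, step - 1)      # smaller quantization scale
    return step

near_overflow = adjust_quantization_step(8, 0.95)
near_underflow = adjust_quantization_step(8, 0.05)
```

Run each picture (or macroblock) through this rule and the buffer occupancy settles between the two limits, which is exactly the overflow/underflow prevention the text describes.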
Next, an example of an MP@ML decoder (image decoding apparatus) of the MPEG scheme will be described with reference to Fig. 25. The encoded image data (bit stream) that have been transmitted through a transmission line are received by a receiver circuit (not shown), or reproduced by a playback circuit, temporarily stored in a reception buffer 21, and then sent to a variable-length decoder circuit (IVLC) 22, which performs variable-length decoding on the data sent from the reception buffer 21. The variable-length decoder circuit 22 sends a motion vector and a prediction mode to a motion compensation circuit 27 and a quantization step to a dequantization circuit 23. In addition, the variable-length decoder circuit 22 sends the decoded quantized data to the dequantization circuit 23. The dequantization circuit 23 dequantizes the quantized data sent from the variable-length decoder circuit 22 according to the quantization step, also provided from the variable-length decoder circuit 22, and sends the resulting data (DCT coefficients) to an IDCT circuit 24. The data (DCT coefficients) output from the dequantization circuit 23 are subjected to the inverse DCT in the IDCT circuit 24 and sent to an operation circuit 25 as output data. If the output data provided from the IDCT circuit 24 (the input bit stream) are I-picture data, they are output from the operation circuit 25 as image data, sent to a frame memory 26, and stored there for the generation of prediction data for the image data (P- or B-picture data) that will be input to the operation circuit 25 later. These image data are also sent, as such, to the external system as a reproduced image. If the output data provided from the IDCT circuit 24 (the input bit stream) belong to a P or B picture, the motion compensation circuit 27 generates prediction image data based on the image data stored in the memory
of frame 26 (the frame memory 26), according to the motion vector and the prediction mode provided from the variable-length decoder circuit 22, and sends them to the operation circuit 25. The operation circuit 25 adds the output data sent from the IDCT circuit 24 and the prediction image data provided from the motion compensation circuit 27 to produce output image data. In the case of a P picture, the output data of the operation circuit 25 are also input to the frame memory 26 and stored there as prediction image data (a reference image) for an image signal to be decoded subsequently.
In MPEG, various profiles and levels other than MP@ML are defined, and various tools are prepared. Scalability is one of these tools. The scalable coding scheme that realizes scalability was introduced into MPEG to accommodate different image sizes and frame rates. For example, in the case of spatial scalability, an image signal having a small image size can be obtained by decoding only the lower-layer bit stream, and an image signal having a large image size can be obtained by decoding the bit streams of both the lower layer and the upper layer.
An encoder for spatial scalability will be described with reference to Figure 26. In the case of spatial scalability, the lower layer corresponds to image signals having a small image size and the upper layer to image signals having a large image size. A lower-layer image signal is first input to the frame memory 1 and then encoded in the same way as in the case of MP@ML. However, the output of the operation circuit 10 is not only sent to the frame memory 11 for use as lower-layer prediction image data, but is also used as upper-layer prediction image data after being enlarged to the same image size as that of the upper layer by an image amplification (up-sampling) circuit 31. As shown in FIG. 26, an upper-layer image signal is input to a frame memory 51.
A motion vector detection circuit 52 determines a motion vector and a prediction mode in the same way as in the case of MP@ML. A motion compensation circuit 62 generates prediction image data according to the motion vector and the prediction mode determined by the motion vector detection circuit 52 and sends them to a weighting circuit (W) 34. The weighting circuit 34 multiplies the prediction image data by a weight W and sends the weighted prediction image data to an operation circuit 33. As already described, the output data (image data) of the operation circuit 10 are input to the image amplification circuit 31, which enlarges the image data generated by the operation circuit 10 to make their size equal to the image size of the upper layer and sends the enlarged image data to a weighting circuit (1-W) 32. The weighting circuit 32 multiplies the enlarged image data from the image amplification circuit 31 by a weight (1-W) and sends the result to the operation circuit 33. The operation circuit 33 adds the output data of the weighting circuits 32 and 34 and sends the result to an operation circuit 53 as prediction image data. The output data of the operation circuit 33 are also input to an operation circuit 60, where they are added to the output data of an inverse DCT circuit, and then input to a frame memory 61 for later use as prediction image data for the image data to be encoded. The operation circuit 53 calculates the difference between the image data to be encoded and the output data of the operation circuit 33, and sends the result as difference data. However, in the case of an intra-coded macroblock, the operation circuit 53 sends the image data to be encoded, as such, to a DCT circuit 54.
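The upper-layer prediction formed by the weighting circuits 32 and 34 is a weighted sum of the motion-compensated upper-layer prediction and the up-sampled lower-layer image. This can be sketched directly; the nearest-neighbour doubling below is a hedged stand-in for the image amplification circuit 31, which the document does not specify:

```python
def upsample2x(line):
    """Nearest-neighbour stand-in for the image amplification
    (up-sampling) circuit: double the width of one scan line."""
    out = []
    for px in line:
        out.extend([px, px])
    return out

def upper_layer_prediction(mc_pred, lower_layer_line, w):
    """Weighted sum performed by the weighting circuits:
    W * (motion-compensated upper-layer prediction)
    + (1 - W) * (up-sampled lower-layer image)."""
    up = upsample2x(lower_layer_line)
    return [w * p + (1.0 - w) * u for p, u in zip(mc_pred, up)]

# One scan line, equal weighting between the two layers (W = 0.5).
pred = upper_layer_prediction([100, 100, 50, 50], [80, 40], 0.5)
```

Because the weight W is itself variable-length coded and transmitted, the decoder can reproduce exactly the same weighted prediction.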
The DCT circuit 54 performs the DCT (discrete cosine transform) on the output of the operation circuit 53 to generate DCT coefficients, which are sent to a quantization circuit 55. As in the case of MP@ML, the quantization circuit 55 quantizes the DCT coefficients according to a quantization scale based on the amount of data stored in a transmission buffer 57 and other factors, and sends the result (quantized data) to a variable-length encoder circuit 56. The variable-length encoder circuit 56 performs variable-length coding on the quantized data (quantized DCT coefficients) and outputs the result as an upper-layer bit stream through the transmission buffer 57. The output data of the quantization circuit 55 are dequantized by a dequantization circuit 58 with the quantization scale that was used in the quantization circuit 55, subjected to the inverse DCT in an inverse DCT circuit 59, and then input to the operation circuit 60. The operation circuit 60 adds the outputs of the operation circuit 33 and the inverse DCT circuit 59 and inputs the result to the frame memory 61. The variable-length encoder circuit 56 also receives the motion vector and the prediction mode detected by the motion vector detection circuit 52, the quantization scale used in the quantization circuit 55, and the weight W used in the weighting circuits 32 and 34; these are encoded by the variable-length encoder circuit 56 and then transmitted. Next, an example of a spatial scalability decoder will be described with reference to Fig. 27. A lower-layer bit stream is input to the reception buffer 21 and then decoded in the same way as in the case of MP@ML.
However, the output of the operation circuit 25 is not only sent to the external system and stored in the frame memory 26 for use as prediction image data for an image signal to be decoded later, but is also used as upper-layer prediction image data after being enlarged, by an image signal amplification circuit 81, to the same image size as that of the upper layer. An upper-layer bit stream is sent to a variable-length decoder circuit 72 through a reception buffer 71, and its variable-length codes are decoded there. That is, a quantization scale, a motion vector, a prediction mode and a weighting coefficient (weight W) are decoded together with the DCT coefficients. The DCT coefficients (quantized data) decoded by the variable-length decoder circuit 72 are dequantized by a dequantization circuit 73 using the decoded quantization scale, subjected to the inverse DCT in an inverse DCT circuit 74, and then sent to an operation circuit 75. A motion compensation circuit 77 generates prediction image data according to the decoded motion vector and prediction mode and inputs them to a weighting circuit 84. The weighting circuit 84 multiplies the output of the motion compensation circuit 77 by the decoded weight W and sends the result to an operation circuit 83. The output of the operation circuit 25 is not only output as lower-layer reproduced image data and sent to the frame memory 26, but is also sent to a weighting circuit 82 after being enlarged by the image signal amplification circuit 81 to the same image size as that of the upper layer. The weighting circuit 82 multiplies the output of the image signal amplification circuit 81 by (1-W), using the decoded weight W, and sends the result to the operation circuit 83. The operation circuit 83 adds the outputs of the weighting circuits 82 and 84 and sends the result to the operation circuit 85.
The operation circuit 85 adds the output of the inverse DCT circuit 74 and the output of the operation circuit 83, outputs the result as upper-layer reproduced image data, and also sends it to the frame memory 76 for use as prediction image data for image data to be decoded later. The above description applies to the processing of a luminance signal; a color-difference signal is processed in a similar way. The motion vector used in processing a color-difference signal is obtained by halving the motion vector for the luminance signal in both the vertical and horizontal directions.
Although the MPEG scheme has been described above, various other high-efficiency coding schemes for moving images have also been standardized. For example, the ITU-T (International Telecommunication Union - Telecommunication Standardization Sector) has standardized the H.261 and H.263 schemes as coding schemes for communication. Basically, like the MPEG scheme, H.261 and H.263 are combinations of motion-compensated predictive coding and DCT coding. An encoding apparatus and a decoding apparatus according to H.261 or H.263 are configured in the same way as in the MPEG scheme, although the details of the header information, etc., are different. Furthermore, within the MPEG activity described above, the standardization of a new high-efficiency coding scheme known as MPEG-4 is now under way. The main characteristics of MPEG-4 are that an image is encoded object by object (an image is encoded in units of a plurality of objects) and that the image can be modified on an object-by-object basis. That is, on the decoding side, the images of the respective objects, or a plurality of images, can be combined to reconstruct a single image. In ISO-IEC/JTC1/SC29/WG11, as already mentioned, the standardization work for MPEG-4 is under way. In this work, a scheme for handling a natural image and a computer graphics image within a common framework is being studied.
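The rule for the color-difference motion vector stated above can be sketched in one line; halving in both directions reflects the chroma signal having half the luminance resolution vertically and horizontally:

```python
def chroma_motion_vector(luma_mv):
    """Motion vector for a color-difference signal: the luminance
    motion vector halved in both the horizontal and vertical
    directions."""
    horizontal, vertical = luma_mv
    return (horizontal / 2, vertical / 2)

chroma_mv = chroma_motion_vector((6, -4))
```

The same decoded luminance vector thus serves both signal components without any extra transmitted data.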
In this scheme, a three-dimensional object is described using VRML, and a moving image and sound or audio are compressed according to the MPEG standard. A scene containing a plurality of three-dimensional objects, moving images, etc., is described according to VRML. The description of a scene (hereinafter abbreviated as scene description), the description of a three-dimensional object, and the AV data consisting of a moving image and sound or audio compressed according to the MPEG scheme, obtained as above, are given synchronization marks and multiplexed by a multiplexing circuit into a bit stream, which is transmitted as a multiplexed bit stream. In a receiving terminal that has received the multiplexed bit stream, a demultiplexing circuit extracts the scene description, the description of a three-dimensional object, and the AV streams (the streams corresponding to the AV data); decoders decode the respective bit streams; and a scene reconstructed by a scene construction circuit is displayed on a display device. In this method, it is necessary to clarify the relationship between the nodes described according to VRML (the descriptions of the three-dimensional objects and the scene description) and the AV data of moving images, sounds, audio, etc. For example, it is necessary to indicate that a certain AV stream is to be mapped as a texture onto a certain three-dimensional object. In VRML, the texture to be attached to (mapped onto) a three-dimensional object is designated by a URL (Uniform Resource Locator, a string of characters indicating a server in a network). This method of designation corresponds to designating the absolute address of an AV data file in the network. On the other hand, in a system according to the MPEG scheme, each AV stream is identified by designating its ID.
This corresponds to designating the relative path of a stream within a session (a communication channel) once the session has been established. That is, in VRML there is no way to identify a stream other than by using a URL, yet an application such as real-time MPEG communication requires designation based on the ID. There is thus a problem of incompatibility between the two schemes. Viewed from another angle, it can be said that VRML is a model in which a client requests information, whereas MPEG assumes a model in which the information is transmitted under the control of a server. The difference between these models causes a problem: it is difficult to mix a computer graphics image and a natural image while maintaining compatibility with VRML 2.0.
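The incompatibility described above amounts to two addressing schemes for the same stream: a VRML URL (an absolute, ASCII address in the network) versus an MPEG stream ID (a relative, numeric identifier within a session). One hedged way to picture the reconciliation is a directory that issues a stream ID for each URL; the class, the issuing order, and the IDs below are all illustrative, not the patent's mechanism:

```python
class StreamDirectory:
    """Map VRML-style URLs to MPEG-style stream IDs.

    VRML designates a texture by URL; an MPEG session identifies
    each AV stream by a numeric ID.  This directory issues an ID
    per URL so a scene description can carry the binary ID in
    place of the ASCII URL.
    """
    def __init__(self):
        self._ids = {}

    def stream_id(self, url):
        # Issue IDs in order of first appearance (1, 2, 3, ...).
        if url not in self._ids:
            self._ids[url] = len(self._ids) + 1
        return self._ids[url]

directory = StreamDirectory()
movie_id = directory.stream_id("http://example.com/movie.mpg")
```

Repeated lookups of the same URL return the same ID, so every node referring to one stream resolves to one elementary stream in the session.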
SUMMARY OF THE INVENTION
The present invention has been made in view of the above, and an object of the invention is therefore to allow a computer graphics image described according to VRML and an image or the like compressed according to the MPEG scheme to be transmitted multiplexed into the same bit stream (data). In a method for producing a bit stream from three-dimensional space modeling data defined by a plurality of nodes and from image/audio data specified by a position included in the nodes, the following steps are carried out: extracting the respective position of a node from the three-dimensional space modeling data; converting the extracted position into a stream ID that corresponds to the image/audio data associated with the position; replacing the position with the stream ID; and multiplexing (or simultaneously transmitting) the image/audio data and the three-dimensional space modeling data including the stream ID, to produce a bit stream. According to one aspect of the present invention, the three-dimensional space modeling data are described in VRML (Virtual Reality Modeling Language), the position is represented by a URL (Uniform Resource Locator) expressed in ASCII format, and the stream ID is expressed in binary format. According to another aspect of the present invention, the stream ID is converted into a character string, and whether the position of the image/audio data is replaced with the stream ID or with the character string is determined depending on whether the image/audio data are provided by a single server or by multiple servers.
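The four claimed steps can be sketched end to end. The scene-description format here is a bare stand-in (a list of node dictionaries with `url` fields) and the multiplexed output is just an interleaved list; the sketch illustrates only the order of operations, not the binary format of the actual bit stream:

```python
def produce_bitstream(scene_nodes, av_data):
    """Sketch of the claimed method: (1) extract each node's
    position (URL), (2) convert it to a stream ID, (3) replace the
    position with the stream ID in the scene description,
    (4) multiplex the scene description and the image/audio data."""
    ids = {}
    for node in scene_nodes:
        url = node.pop("url", None)                  # step 1: extract position
        if url is not None:
            sid = ids.setdefault(url, len(ids) + 1)  # step 2: URL -> stream ID
            node["stream_id"] = sid                  # step 3: replace position
    # Step 4: multiplex the scene description and the AV streams.
    multiplexed = [("scene", scene_nodes)]
    for url, sid in ids.items():
        multiplexed.append(("elementary_stream", sid, av_data[url]))
    return multiplexed

stream = produce_bitstream([{"node": "MovieTexture", "url": "movie.mpg"}],
                           {"movie.mpg": b"\x00\x01"})
```

On the receiving side the process is reversed: the demultiplexer separates the scene description from the elementary streams and the stream IDs re-establish which stream is mapped onto which node.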
BRIEF DESCRIPTION OF THE DRAWINGS Other objects and advantages of the present invention will become apparent from the following detailed description of the presently preferred embodiments thereof, which should be considered in conjunction with the accompanying drawings, in which: Figure 1 is a block diagram showing an example of the configuration of a first embodiment of the coding apparatus according to the present invention; Figure 2 shows a relation between a scene description SD and the nodes; Figure 3 shows an example of an ASCII format of a scene description for joining a moving image as a texture to a node; Figure 4 shows an example of an ASCII format of a scene description for joining a still image as a texture to a node; Figure 5 shows an example of a binary format of a scene description for joining a moving image as a texture to a node; Figure 6 shows an example of a binary format of a scene description for joining a still image as a texture to a node; Figure 7 shows an example of the detailed configuration of a multiplexer circuit shown in Figure 1; Figure 8 is a block diagram showing an example of the configuration of a first embodiment of the decoding apparatus according to the invention; Figure 9 shows an example of the detailed configuration of a demultiplexer circuit 404 shown in Figure 8; Figure 10 shows an example of the configuration of a reconstruction circuit 411 shown in Figure 8; Figure 11 is a block diagram showing an example of the detailed configuration of a synthesizer circuit shown in Figure 10; Figure 12 shows an example of an object descriptor OD; Figure 13 shows an example of an "ES_Descriptor"; Figure 14 shows an example of an "ES_ConfigParams";
Figure 15 is a block diagram showing an example of the configuration of a second embodiment of the coding apparatus according to the invention;
Figure 16 shows an example of a binary format of a scene description for joining a moving image as a texture to a node; Figure 17 shows an example of a binary format of a scene description for joining a still image as a texture to a node; Figure 18 is a block diagram showing an example of the configuration of a second embodiment of the decoding apparatus according to the invention; Figure 19 is a block diagram showing an example of the configuration of a third embodiment of the coding apparatus according to the invention; Figure 20 shows an example of a binary format of an SD scene description for joining a moving image as a texture; Figure 21 shows an example of a binary format of an SD scene description for joining a still image as a texture; Figure 22 is a block diagram showing an example of the configuration of a third embodiment of the decoding apparatus according to the invention; Figure 23 is a diagram for explaining texture mapping; Figure 24 is a block diagram showing an example of an MP@ML encoder of the MPEG scheme;
Figure 25 is a block diagram showing an example of an MP@ML decoder of the MPEG scheme; Figure 26 is a block diagram showing an example of a spatial scalability encoder; and Figure 27 is a block diagram showing an example of a spatial scalability decoder.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS The preferred embodiments of the present invention will be explained in detail with reference to the accompanying drawings. Figure 1 is a block diagram of a first embodiment of the coding apparatus according to the present invention. With reference to Figure 1, a scene control circuit 301 receives a request signal (Request (REQ)), determines, referring to an SD scene description (the details will be described later) that is stored in a storage device 302, which AV object (three-dimensional object, natural image, sound, or the like) is to be transmitted, and sends a scene request signal (Scene Request (SREQ)) to the storage device 302. The storage device 302 stores the SD scene description, which describes a two-dimensional or three-dimensional scene. The SD scene description is described in an ASCII format that complies with VRML2.0. A storage device 306 stores bit streams (elementary streams (ES)) of audio and video (AV) data such as a moving image, a still image and sound. A storage device 305 stores information (object stream information (OI)) needed to decode the AV objects stored in the storage device 306. For example, the object stream information OI is a buffer size necessary to decode an AV object, or a synchronization mark of each access unit. The object stream information OI includes such information for all the AV bit streams corresponding to the respective AV objects. A relation between a scene description, AV data (streams), and three-dimensional objects will now be described with reference to Figure 2. In the example of Figure 2, a rectangular image sequence and a triangular pyramid generated by computer graphics are shown on the screen 352. Although in this example no texture is attached to the triangular pyramid, a texture can be attached to it as in the case of other three-dimensional objects. An attached texture can be either a still image or a moving image.
The SD scene description 350 contains descriptions known as nodes. There is a root node SD0 that describes how to arrange the objects in the whole image. A node SD1, which is a child node of the parent node SD0, describes information related to the triangular pyramid. A node SD2, which is also a child node of the parent node SD0, describes information related to the rectangular plane to which the images are to be joined. In Figure 2, the image signal comprises three video objects VO (background, sun and person). The node SD2 describes information related to the background plane, the node SD3 describes information related to the rectangular plane for joining the sun, and the node SD4 describes information related to the plane for joining the person. Each node describes a URL indicating the address of the corresponding AV data file (bit stream). The nodes SD3 and SD4 are child nodes of the node SD2. The individual SD scene description is a collection of all the nodes SD0-SD4. In the following, a collection of descriptions of all the nodes is called a scene description, and the respective nodes are known as objects (two-dimensional or three-dimensional objects). Therefore, each node corresponds to a single two-dimensional or three-dimensional object. Each object corresponds, one to one, to an object descriptor OD that describes the AV data (bit stream) related to the object. Returning to Figure 1, an analysis circuit 307 reads a URL (indicating the address of an AV data file) described in a node that is sent from the storage device 302, and outputs a request signal (ES Request (ESREQ)) to the storage device 306 to request the output of the AV data (bit stream) corresponding to the URL. In addition, the analysis circuit 307 sends, to the storage device
305, a request signal (OI Request (OIREQ)) to request the output of the object stream information OI describing information related to the AV data (bit stream) corresponding to the URL. An OD (object descriptor) generation circuit 304 receives the object stream information OI related to an AV object that is sent from the storage device 305, and extracts, as an object descriptor OD, only the AV data (bit stream) information that was requested by the OIREQ request signal. In addition, the OD generation circuit 304 generates an ID number OD_ID for each extracted object descriptor OD, writes it into the object descriptor OD, sends the resulting object descriptor OD to a multiplexer circuit 303, and also sends the generated ID number OD_ID to a BIFS encoder 308. The BIFS encoder 308 converts the scene description of an ASCII format that is sent from the storage device 302 into a binary format, and replaces a URL included in the SD scene description with the ID number OD_ID that is sent from the OD generation circuit 304. Then, the BIFS encoder 308 sends the scene description B-SD, which has been converted into the binary format and in which the URL has been replaced with the ID number OD_ID, to the multiplexer circuit 303. The multiplexer circuit 303 multiplexes, in a prescribed order, the AV data (bit streams) stored in the storage device 306, the scene description B-SD that has been converted into the binary format by the BIFS encoder 308, and the object descriptors OD that have been generated by the OD generation circuit 304, and sends the multiplexed result as a multiplexed bit stream FS. A detailed example of the multiplexer circuit 303 will be described below with reference to Figure 7. Next, the operation of the above embodiment will be described. When a user inputs, from an external terminal (not shown), a request signal to cause a certain AV object to be displayed, a request signal REQ is sent to a scene control circuit 301.
Upon reception of the REQ request signal, the scene control circuit 301 determines, based on the request signal REQ and with reference to the SD scene description that is stored in the storage device 302, which AV object should be transmitted, and sends a scene request signal SREQ to the storage device 302. Upon receipt of the scene request signal SREQ, the storage device 302 reads the corresponding SD scene description (described in an ASCII format) and sends it to the analysis circuit 307 and the BIFS encoder 308. Figure 3 shows an example of an SD scene description (described in the ASCII format) for joining a moving image as a texture. In this example, a URL indicating the address of the moving image file to be joined is described in the sixth line. Figure 4 shows an example of an SD scene description (described in the ASCII format) for joining a still image as a texture. In this example, a URL indicating the address of a still image file to be joined is described in the second line. The formats of Figures 3 and 4 comply with the VRML node description. The analysis circuit 307 reads a URL (indicating the address of the AV data file (bit stream)) included in a node that constitutes the supplied SD scene description, and sends an ESREQ request signal to the storage device 306. As a result, the ES of the corresponding AV data (bit stream) is sent from the storage device 306 and supplied to the multiplexer circuit 303. In addition, the analysis circuit 307 sends, to the storage device 305, an OIREQ request signal to request the output of the object stream information OI related to the ES of the AV data (bit stream) indicated by the URL that is included in the node. As a result, the object stream information OI corresponding to the URL is sent from the storage device 305 to the OD generation circuit 304.
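The node hierarchy of Figure 2 and the two operations just described — reading every URL out of the scene description, and (as performed later by the BIFS encoder) substituting each URL with its assigned ID number OD_ID — can be sketched as follows. This is a minimal illustration: the tree shape, field names and URLs are assumptions, not the actual VRML or BIFS syntax.

```python
# Illustrative scene graph in the shape of Figure 2 (root SD0;
# children SD1, SD2; SD2's children SD3, SD4). Field names and URLs
# are hypothetical.
scene = {
    "id": "SD0", "children": [
        {"id": "SD1", "children": []},                       # pyramid (no texture)
        {"id": "SD2", "url": "http://server/background.es",  # background plane
         "children": [
             {"id": "SD3", "url": "http://server/sun.es", "children": []},
             {"id": "SD4", "url": "http://server/person.es", "children": []},
         ]},
    ],
}

def collect_urls(node):
    """What the analysis circuit 307 does: read every URL so the
    corresponding elementary streams can be requested."""
    urls = [node["url"]] if "url" in node else []
    for child in node["children"]:
        urls += collect_urls(child)
    return urls

def replace_urls(node, url_to_od_id):
    """What the BIFS encoder 308 does during binary conversion:
    substitute each URL with the ID number OD_ID assigned by the
    OD generation circuit 304."""
    if "url" in node:
        node["ObjectDescriptorID"] = url_to_od_id[node.pop("url")]
    for child in node["children"]:
        replace_urls(child, url_to_od_id)

urls = collect_urls(scene)
print(urls)
# ['http://server/background.es', 'http://server/sun.es', 'http://server/person.es']
replace_urls(scene, {u: i for i, u in enumerate(urls, start=1)})
print(scene["children"][1]["ObjectDescriptorID"])
# 1
```

After the substitution the scene carries only binary-friendly IDs; the decoder side recovers the URL-to-stream association by matching these IDs against the transmitted object descriptors.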
The OD generation circuit 304 extracts, as an object descriptor OD, only the information requested by the OIREQ request signal from the object stream information OI related to the AV object that is sent from the storage device 305. In addition, the OD generation circuit 304 generates an ID number OD_ID, writes it into the object descriptor OD, and sends the resulting object descriptor OD to the multiplexer circuit 303. Still further, the OD generation circuit 304 sends the ID number OD_ID that has been generated for each object descriptor OD to the BIFS encoder 308. The BIFS encoder 308 converts the SD scene description of an ASCII format that is sent from the storage device 302 into data of a binary format (a scene description B-SD) by a predetermined method, and replaces the URL included in the SD scene description. Then, the BIFS encoder 308 sends the scene description B-SD that has been converted into the binary format to the multiplexer circuit 303. The details of the binary format are described in the document known as MPEG4 WD (document number N1825) that has been standardized by the ISO. An example of the binary format will be described below. Figure 5 shows the data obtained by converting a scene description (ASCII format; see Figure 3) for joining a moving image as a texture into a binary format.
In Figure 5, "ObjectDescriptorID" appearing on line 29 is a flag indicating the ID number OD_ID of a moving image to be joined to this node. The BIFS encoder 308 writes the ID number OD_ID that is supplied from the OD generation circuit 304 into this part of the scene description B-SD that has been converted into the binary format. As a result, the address of the AV data (bit stream) that was described as a URL in the ASCII format becomes the ID number OD_ID (binary format). Figure 6 shows the data obtained by converting a scene description (ASCII format; see Figure 4) for joining a still image as a texture into a binary format. In this example, "ObjectDescriptorID" appears on line 17 and the ID number OD_ID is written into this portion of the scene description B-SD that has been converted into the binary format. The scene description B-SD of a binary format obtained in this manner is supplied to the multiplexer circuit 303. The multiplexer circuit 303 multiplexes, in the prescribed order, the AV data (bit streams) stored in the storage device 306, the scene description B-SD that has been converted into the binary format by the BIFS encoder 308, and the object descriptors OD that have been generated by the OD generation circuit 304, and sends the multiplexed bit stream FS. Figure 7 shows an example of a detailed configuration of the multiplexer circuit 303. In Figure 7, a start code generation circuit 303a generates and sends a start code indicating a starting position of a bit stream. The AV data (bit streams) ES1-ESN that are sent from the storage device 306 are supplied to the corresponding terminals. The scene description B-SD in a binary format that is output from the BIFS encoder 308 and the object descriptors OD that are sent from the OD generation circuit 304 are supplied to the corresponding terminals. In addition, the start code that is output from the start code generation circuit 303a is supplied to a corresponding terminal.
The multiplexer circuit 303 operates a switch to make a connection to the terminal to which the start code generation circuit 303a is connected, whereby the start code is sent first. Next, switching is made to the terminal to which the scene description B-SD is input, whereby the scene description is sent. Then, switching is made to the terminal to which the object descriptors OD are input, whereby the object descriptors OD are sent. Finally, sequential switching is made, according to the data, to the terminals to which the AV data (bit streams) are input, whereby the AV data (bit streams) ES1-ESN are sent. The multiplexer circuit 303 thus selects the start code, the scene description, the object descriptors OD and the AV data (bit streams) with the switch, and thereby sends these to the external system as a multiplexed bit stream FS. The multiplexed bit stream FS is sent to a receiving terminal via a transmission line, for example. An example of the configuration of an embodiment of a decoding apparatus corresponding to the coding apparatus of Figure 1 will now be described with reference to Figure 8. Figure 8 is a block diagram showing an example of the configuration of an embodiment of a decoding apparatus according to the invention. In Figure 8, a demultiplexer circuit 404 receives a multiplexed bit stream FS and then separates and extracts the respective bit streams that constitute the multiplexed bit stream FS. Figure 9 shows an example configuration of the demultiplexer circuit 404. As shown in Figure 9, the demultiplexer circuit 404 detects a start code in the multiplexed bit stream FS and thereby recognizes the presence of the respective bit streams. Then, the introduced multiplexed bit stream FS is separated, with a switch, into a scene description B-SD and object descriptors OD, which are output at the corresponding terminals. In the same way, the bit streams ES1-ESN of the AV data are separated and output at the corresponding terminals.
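The prescribed ordering performed by the switch in Figure 7 — start code, then scene description B-SD, then object descriptors OD, then the streams ES1-ESN — can be sketched as a simple concatenation; the demultiplexer of Figure 9 inverts this by locating the start code and switching the sections back out. The byte values below are placeholders, not the actual MPEG-4 syntax.

```python
# Sketch of the multiplexing order of Figure 7. START_CODE is an
# assumed placeholder marking the start position of the bit stream.
START_CODE = b"\x00\x00\x01"

def multiplex(scene_b_sd, object_descriptors, elementary_streams):
    """Emit the multiplexed bit stream FS in the prescribed order:
    start code, B-SD, OD1..ODn, then ES1..ESN."""
    sections = [START_CODE, scene_b_sd]
    sections += object_descriptors   # OD1..ODn
    sections += elementary_streams   # ES1..ESN
    return b"".join(sections)

fs = multiplex(b"BSD", [b"OD1", b"OD2"], [b"ES1", b"ES2"])
print(fs)
# b'\x00\x00\x01BSDOD1OD2ES1ES2'
```

In the real system each section additionally carries length or framing information so the receiving switch knows where one section ends and the next begins; that bookkeeping is omitted here.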
Returning to Figure 8, an analysis circuit 406 receives the object descriptors OD that have been separated by the demultiplexer circuit 404, determines the class and number of decoders that are necessary to decode the AV data (bit streams), and causes the respective AV data (bit streams) to be sent to the corresponding decoders. In addition, the analysis circuit 406 reads the buffer capacities needed to decode the respective bit streams from the object descriptors OD and supplies these (Init) to the respective decoders 407-409. Still further, to allow the determination as to which nodes the respective streams ES1-ESN belong, the analysis circuit 406 produces the ID numbers OD_ID of the respective object descriptors for the decoders that are to decode the bit streams described in the respective object descriptors OD. The decoders 407-409 decode the bit streams according to a predetermined decoding method corresponding to the encoding method, and send the resulting video data or audio (sound) data to a reconstruction circuit 411. In addition, the decoders 407-409 produce, for the reconstruction circuit 411, the ID numbers OD_ID that indicate to which nodes the respective decoded data (video data or audio (sound) data) belong. If the received bit stream is an image, the decoders 407-409 decode, from the bit stream, data (SZ, POS) indicating the size and position on the screen of the image (image size and screen position data) and data (key data) indicating the degree of penetration of the image included in the bit stream, and send these data to the reconstruction circuit 411. Although in the above embodiment three decoders 407-409 are provided for a case where N is equal to 3, it is to be understood that the number of decoders can change according to the data to be processed.
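The dispatching role of the analysis circuit 406 can be sketched as follows: for each object descriptor it selects a decoder class from the stream type, passes the buffer size as the initialization information (Init), and attaches the ID number OD_ID so that the decoded output can later be matched to its node. The descriptor field names are illustrative assumptions.

```python
# Sketch of the analysis circuit 406's dispatch step. The keys
# "streamType", "bufferSize" and "OD_ID" are hypothetical stand-ins
# for the corresponding object descriptor fields.
def dispatch(object_descriptors):
    assignments = []
    for od in object_descriptors:
        decoder = {"video": "VideoDecoder",
                   "audio": "AudioDecoder"}[od["streamType"]]
        assignments.append({
            "decoder": decoder,                    # class of decoder needed
            "init_buffer_size": od["bufferSize"],  # Init information
            "od_id": od["OD_ID"],                  # for later node matching
        })
    return assignments

ods = [{"streamType": "video", "bufferSize": 4096, "OD_ID": 4},
       {"streamType": "audio", "bufferSize": 1024, "OD_ID": 5}]
print(dispatch(ods))
```

The number of entries returned corresponds to the number of decoders required, which, as noted above, varies with the data to be processed.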
An analysis circuit 410 analyzes the scene description B-SD of the binary format and supplies the resulting data to the reconstruction circuit 411. In addition, the analysis circuit 410 reads the ID numbers OD_ID in the scene description B-SD corresponding to the ID numbers OD_ID in the object descriptors and supplies them to the reconstruction circuit 411. Figure 10 shows a relationship between the bit streams for reconstructing a complete image and an example of the reconstruction circuit 411. As shown in Figure 10, the reconstruction circuit 411 consists of a synthesizer circuit 351; an image signal that is produced by the synthesizer circuit 351 is supplied to a display device 352, and the presentation of the image is thereby made. In Figure 10, the synthesizer circuit 351 and the display device 352 are shown as the reconstruction circuit 411. This is to show how the image that has been produced in the synthesizer circuit 351 is displayed on the display device 352; the display device 352 is not included in the reconstruction circuit 411. The synthesizer circuit 351 receives the node data and the ID numbers OD_ID that are supplied from the analysis circuit 410, and the image data, the key data, the image size and screen position information (SZ, POS), and the ID numbers OD_ID that are supplied from the decoders 407-409, captures the image data corresponding to each OD_ID, joins the image data to the nodes based on the key data and the size and position information on the screen, and sends the image signals corresponding to the resulting image data to the display device 352. Figure 11 is a block diagram showing an example of the reconstruction circuit 411. As shown in Figure 11, the reconstruction circuit 411 comprises a coupling circuit 360, object synthesizer circuits 500-502 and a two-dimensional conversion circuit 503.
The object synthesizer circuit 500 consists of a memory group 500-1 and a presentation circuit 500-2. The memory group 500-1 consists of a texture memory 500-1a, a gray-scale memory 500-1b and a three-dimensional object memory 500-1c. For example, the texture memory 500-1a stores AV data (bit stream) that are sent from the decoder 407 as texture data. The gray-scale memory 500-1b stores key data indicating the degree of penetration that are provided from the decoder 407. The three-dimensional object memory 500-1c stores information of three-dimensional objects (nodes) that is output from the analysis circuit 410. The three-dimensional object (node) information includes polygon formation information, illumination information for illuminating the polygons, and other information. The image size and screen position data (SZ, POS) are also stored in a certain place, for example, in the gray-scale memory 500-1b. The presentation circuit 500-2 generates a three-dimensional object using polygons based on the node stored in the three-dimensional object memory 500-1c. In addition, the presentation circuit 500-2 receives the texture and the key data indicating the degree of penetration from the texture memory 500-1a and the gray-scale memory 500-1b, respectively, joins the texture to the corresponding node, and executes a process corresponding to the key data so that the texture has the preselected transparency. The data thus obtained are sent to the two-dimensional conversion circuit 503. In addition, the image size and screen position data (SZ, POS) are sent to the two-dimensional conversion circuit 503. Since the object synthesizer circuits 501 and 502 are configured in the same way as the object synthesizer circuit 500, these will not be described here. When the texture (image data) is joined (mapped) to the object, it is necessary to recognize the relationship between the texture and the object.
To identify the relationship, the ID numbers OD_ID described in the object descriptors OD and the ID numbers OD_ID described in the scene description B-SD are used. Therefore, the data that have been produced for the reconstruction circuit 411 are first supplied to the coupling circuit 360 before the data are sent to the object synthesizer circuits 500-502. The ID numbers OD_ID described in the object descriptors OD are coupled with the ID numbers OD_ID described in the scene description B-SD by the coupling circuit 360 as shown in Figure 11; the relationship is found thereby. The two-dimensional conversion circuit 503 converts the texture-attached objects that are produced from the respective object synthesizer circuits 500-502 into a two-dimensional image signal by mapping onto a two-dimensional plane, according to viewpoint information that is supplied from the outside and the image size and screen position data that are provided from the object synthesizer circuits. The image signal is sent to the display device 352 for display. Next, the operation of the above embodiment will be described with reference to Figure 8. A multiplexed bit stream FS that has been transmitted through a transmission line is sent to the demultiplexer circuit 404. The demultiplexer circuit 404 detects the start code in the multiplexed bit stream FS and thereby also recognizes the bit streams. The demultiplexer circuit 404 separates a scene description B-SD, the object descriptors OD, and the bit streams ES1-ESN corresponding to the AV data (bit streams) from the multiplexed bit stream FS, and sends these by suitably switching the switch shown in Figure 9. The object descriptors OD are supplied to the analysis circuit 406, the bit streams ES1-ESN are supplied to the respective decoders 407-409, and the scene description B-SD of the binary format is supplied to the analysis circuit 410.
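The matching performed by the coupling circuit 360 amounts to a lookup: the OD_ID carried with each decoded output is compared against the ObjectDescriptorID written into the scene description B-SD, which yields the node each texture belongs to. A minimal sketch, with illustrative field names:

```python
# Sketch of the coupling circuit 360: match decoded outputs to scene
# nodes by OD_ID. "ObjectDescriptorID", "OD_ID" and "data" are
# hypothetical field names standing in for the transmitted values.
def couple(decoded_outputs, nodes):
    nodes_by_id = {n["ObjectDescriptorID"]: n for n in nodes}
    return [(nodes_by_id[d["OD_ID"]]["id"], d["data"])
            for d in decoded_outputs]

nodes = [{"id": "SD3", "ObjectDescriptorID": 4},
         {"id": "SD4", "ObjectDescriptorID": 5}]
decoded = [{"OD_ID": 5, "data": "person texture"},
           {"OD_ID": 4, "data": "sun texture"}]
print(couple(decoded, nodes))
# [('SD4', 'person texture'), ('SD3', 'sun texture')]
```

Each matched pair is then routed to the object synthesizer circuit responsible for that node.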
The analysis circuit 410 analyzes the scene description B-SD of the binary format that leaves the demultiplexer circuit 404 and supplies a result (three-dimensional object information (nodes)) to the reconstruction circuit 411. In addition, the analysis circuit 410 decodes the ID numbers OD_ID of the object descriptors OD of the AV data (bit streams) to be attached to the nodes and supplies them to the reconstruction circuit 411. The analysis circuit 406 receives the object descriptors OD, recognizes the class and number of the decoders necessary to decode the bit streams, and causes the bit streams ES1-ESN to be supplied to the respective decoders. In addition, the analysis circuit 406 reads the buffer capacities or a synchronization flag of each access unit needed to decode the respective bit streams from the object descriptors OD, and supplies these as initialization information (Init) to the respective decoders 407-409. As a result, the decoders 407-409 perform initialization referring to the supplied values (the initialization information (Init)). Further, to indicate to which objects the bit streams that have been processed by the respective decoders 407-409 belong, the analysis circuit 406 produces the ID numbers OD_ID of the respective object descriptors.
The decoders 407-409 perform initialization, for example by configuring a buffer, according to the initialization information that is supplied from the analysis circuit 406. When the bit streams corresponding to the AV data (bit streams) that are produced from the demultiplexer circuit 404 are received, the decoders 407-409 decode the respective bit streams by a predetermined method corresponding to the encoding operation and produce the resulting video data or audio (sound) data. In addition, the decoders 407-409 send, to the reconstruction circuit 411, the ID numbers OD_ID that indicate to which object the bit streams that have been decoded by the respective decoders correspond. Still further, if the decoded bit stream is an image, the decoders 407-409 send data indicating the size and position on the screen of the image (SZ, POS) and data
(key data) indicating the degree of penetration of the image. As shown in Figure 11, the data that have been produced for the reconstruction circuit 411 are supplied to the corresponding object synthesizer circuits 500-502. One object synthesizer circuit corresponds to each node. As already described, when the different types of data are supplied to the corresponding object synthesizer circuits 500-502, it is necessary to find to which objects the bit streams that have been processed by the respective decoders 407-409 belong. Therefore, the ID numbers OD_ID described in the object descriptors OD are collated (coupled) by the coupling circuit 360 with the ID numbers OD_ID described in the scene description B-SD before the data are supplied to the corresponding object synthesizer circuits. It is thereby possible to recognize the relationship between the decoded signal (bit stream) and the three-dimensional object information (node). The object synthesizer circuits 500-502 receive, from the decoders 407-409, the decoded signals together with the ID numbers OD_ID that indicate the nodes. If the received decoded signal is image data, the object synthesizer circuits 500-502 join the image to a two-dimensional or three-dimensional object to be generated. The above operation will now be described using the object synthesizer circuit 500 as an example. The texture data that are to be joined to the object are stored in the texture memory 500-1a. The key data and the ID number OD_ID are sent to the gray-scale memory 500-1b and stored there. The node
(three-dimensional object information) is stored in the three-dimensional object memory 500-1c. In addition, the image size and screen position data (SZ, POS) are also stored in a certain location, for example, the gray-scale memory 500-1b. The ID number OD_ID is used to recognize the node. The presentation circuit 500-2 reads the node (three-dimensional object information) that is stored in the three-dimensional object memory 500-1c and generates a corresponding object through the use of polygons. In addition, the presentation circuit 500-2 joins the image data that are received from the texture memory 500-1a to the polygons generated above in accordance with the key data indicating the degree of penetration that are received from the gray-scale memory 500-1b. In addition, the image size and screen position data (SZ, POS) are read from the gray-scale memory 500-1b and supplied to the two-dimensional conversion circuit 503. Similar operations are carried out in the object synthesizer circuits 501 and 502. The two-dimensional conversion circuit 503 is supplied with the texture-attached two-dimensional or three-dimensional objects from the object synthesizer circuits 500-502. Based on the viewpoint information that is supplied from the outside and the image size and screen position data (SZ, POS), the two-dimensional conversion circuit 503 converts the three-dimensional objects into a two-dimensional image signal by mapping onto a two-dimensional plane. The three-dimensional objects that have been converted into the two-dimensional image signal are output (displayed) on the display device 352. If all the objects are two-dimensional, the outputs of the respective presentation circuits 500-2 to 502-2 are combined as they are according to their degrees of penetration (key data), and then output. In this case, no conversion is made.
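For the all-two-dimensional case just described, combining the outputs according to their key data can be sketched as an alpha-style blend. The exact semantics of the key data are defined by the system, so this treats the key value simply as a degree of penetration in [0, 1] applied to single pixel values — an illustrative assumption, not the normative combination rule.

```python
# Sketch of combining two-dimensional outputs according to key data.
# key = 1.0 means the texture is fully opaque (hides the background);
# smaller keys let more of the background show through.
def blend(foreground, background, key):
    return key * foreground + (1.0 - key) * background

print(blend(200.0, 40.0, 1.0))   # opaque texture: background hidden
print(blend(200.0, 40.0, 0.25))  # mostly-transparent texture
```

In a full implementation this per-pixel blend would be applied over the image regions given by the size and position data (SZ, POS).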
Figures 12-14 show structures of an object descriptor OD. Figure 12 shows the entire structure of the object descriptor OD. In Figure 12, "NodeId" in the third line is a 10-bit flag that indicates the ID number of this descriptor, and corresponds to the ID number
OD_ID mentioned above. The "streamCount" element in the fourth line is an 8-bit flag that indicates the number of AV data units (ES bit streams) included in the object descriptor OD.
"ES_Descriptor" elements, which are needed to decode the respective ES bit streams, are transmitted in the number indicated by "streamCount". The "extensionFlag" element in the fifth line is a flag that indicates whether other information is transmitted. If the value of this flag is "1", other descriptors are transmitted. The "ES_Descriptor" in the eighth line is a descriptor that indicates the information related to each bit stream. Figure 13 shows details of the "ES_Descriptor". In Figure 13, "ES_number" in the third line is a 5-bit flag indicating an ID number for the identification of the bit stream. The "streamType" element in the sixth line indicates the format of the bit stream and, for example, is an 8-bit flag that indicates these data as MPEG2 video. The "QoS_Descriptor" element is an 8-bit flag that indicates a request to a network in a transmission. The "ESConfigParams" element in the eighth line is a descriptor that describes information necessary to decode the bit stream, and its details are shown in Figure 14. The details of ESConfigParams are described in the MPEG4 system. In the above embodiment, in the encoding apparatus, a URL that is included in a node that constitutes modeling data in three-dimensional space (VRML data) is replaced by the ID number OD_ID of an object descriptor OD corresponding to the AV data (bit stream) that are designated by the URL. On the decoding side, an object descriptor OD that corresponds to the ID number OD_ID included in the node is searched for, whereby the corresponding AV data (bit stream) are detected (recognized). Thus, it becomes possible to transmit a CG image and a natural image that are multiplexed into the same stream while the description method of a scene and a three-dimensional object is kept compatible with, for example, the VRML scheme. In the above embodiments, the encoded audio and video data (AV data (bit streams)) are stored in the storage device 306.
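The bit widths given above ("NodeId": 10 bits, "streamCount": 8 bits) suggest how a descriptor header is read from the bit stream. The following sketch illustrates that field layout only — "extensionFlag" is read here as a single bit, and the normative syntax is that of the MPEG-4 Systems working draft, not this code.

```python
# Minimal big-endian bit reader for illustrating the descriptor
# field layout described in the text.
class BitReader:
    def __init__(self, data):
        self.bits = "".join(f"{b:08b}" for b in data)
        self.pos = 0

    def read(self, n):
        value = int(self.bits[self.pos:self.pos + n], 2)
        self.pos += n
        return value

def parse_od_header(data):
    """Read the leading fields of an object descriptor OD with the
    bit widths stated in the text (assumed layout, for illustration)."""
    r = BitReader(data)
    return {
        "NodeId": r.read(10),        # corresponds to OD_ID
        "streamCount": r.read(8),    # number of ES_Descriptors to follow
        "extensionFlag": r.read(1),  # "1" => further descriptors follow
    }

print(parse_od_header(bytes([0x01, 0x00, 0x40])))
# {'NodeId': 4, 'streamCount': 1, 'extensionFlag': 0}
```

With "streamCount" in hand, a full parser would then loop that many times over "ES_Descriptor" entries, each beginning with the 5-bit "ES_number".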
However, they can also be input directly from an audio or video encoding device, for example, without passing through this storage device. Although in the above embodiments the AV data (bit stream), the object descriptors OD and the scene description SD are stored in separate storage devices, they can be stored in the same storage device or recording medium. In addition, although the scene description SD is stored in advance as a file, the AV data (bit stream) and the object stream information OI can be generated in real time at the time of transmission. Next, a second embodiment of the coding apparatus according to the invention will be described with reference to Figure 15. In Figure 15, the portions having corresponding portions in Figure 1 receive the same reference symbols as the latter and will not be described again. In this embodiment, a URL change circuit 309 is added to the configuration of Figure 1. The output data of the analysis circuit 307 and the output of the OD generation circuit 304 are supplied to the URL change circuit 309, and the output data of the URL change circuit 309 are supplied to the BIFS encoder 308. The remaining configuration is the same as in the embodiment of Figure 1. The URL change circuit 309 converts the ID number OD_ID produced by the OD generation circuit 304 into a corresponding character string of ASCII format, and then sends it out. For example, a description will be made of a case in which the object stream information OI needed to decode the AV data
(bit stream) that is to be attached to a certain node stored in the storage device 302 has the following address: http://serverA/AV_scene1/object_file.1 (1) In this case, the object stream information OI is read from the storage device 305, and the ID number OD_ID of the object descriptor OD corresponding to the object stream information OI is supplied from the OD generation circuit 304. The URL change circuit 309 receives the ID number OD_ID and rewrites (changes) the URL into a suitable character string of the ASCII format. For example, if OD_ID is "4", expression (1) is rewritten (changed) into the following: mpeg4://4 (2) Here the character string "mpeg4" at the head of a URL character string indicates that the number located immediately after the character string "://" following "mpeg4" (in this example, the character "4") is the ID number OD_ID. There may be a case where a URL described in a node stored in the storage device 302 designates a file existing in a device (in the network) different from the encoding apparatus of Figure 15. In this case, the URL change circuit 309 suspends the conversion operation, and the URL of expression (1), for example, is supplied as-is to the BIFS encoder 308. The operation of this embodiment will be briefly described below. When a request signal REQ is received, the scene control circuit 301 determines, based on the request signal REQ, which AV object should be transmitted in relation to the scene description SD stored in the storage device 302, and sends a scene request signal SREQ to the storage device 302. When the scene request signal SREQ is received, the storage device 302 reads the corresponding scene description SD (described in ASCII format) and supplies it to the analysis circuit 307 and to the BIFS encoder 308.
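A minimal sketch of the rewriting performed by the URL change circuit 309, under the assumption that locally stored streams can be recognized by a simple lookup; the set of local files and the function name are hypothetical illustrations, and the "mpeg4://" header follows the form of expression (2):

```python
# Hypothetical table of AV data files held in the local storage
# device; any other URL is treated as residing on another server.
LOCAL_FILES = {"http://serverA/AV_scene1/object_file.1"}

def rewrite_url(url: str, od_id: int) -> str:
    """Rewrite a locally resolvable URL into an OD_ID string."""
    if url in LOCAL_FILES:
        # expression (1) -> expression (2): the number after "://"
        # is the ID number OD_ID of the object descriptor OD.
        return f"mpeg4://{od_id}"
    # File on another server: suspend conversion, pass URL through.
    return url

print(rewrite_url("http://serverA/AV_scene1/object_file.1", 4))
# mpeg4://4
```

The design choice mirrored here is that the rewritten string stays a syntactically valid URL, so the node format itself does not change between the local and remote cases.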
The analysis circuit 307 reads a URL (indicating the address of an AV data file (bit stream)) included in a node that constitutes the scene description
SD supplied, and sends to the storage device 306 a request signal SREQ for the output of the AV data
(bit stream) corresponding to the URL. As a result, the ES of the corresponding AV data (bit stream) leaves the storage device 306 and is sent to the multiplexer circuit 303. In addition, the analysis circuit 307 produces, for the storage device 305, a request signal OIREQ to request the output of the object stream information OI related to the ES of the AV data (bit stream) indicated by the URL included in the node. As a result, the object stream information OI corresponding to the URL leaves the storage device 305 and is supplied to the OD generation circuit 304. Furthermore, the analysis circuit 307 sends the URL change circuit 309 the URL that is included in the node. The OD generation circuit 304 extracts, as an object descriptor OD, only the object stream information requested by OIREQ from the object stream information OI related to the AV object supplied from the storage device 305. In addition, the OD generation circuit 304 generates an ID number OD_ID, registers it in the object descriptor OD, and sends the resulting object descriptor OD to the multiplexer circuit 303. Furthermore, the OD generation circuit 304 sends the ID number OD_ID that has been generated for each object descriptor OD to the URL change circuit 309. If the URL supplied from the analysis circuit 307 designates an existing file on another server in the network, the URL change circuit 309 sends the URL as-is to the BIFS encoder 308. If the supplied URL designates an AV data file (bit stream) stored in the storage device 306, the URL change circuit 309 generates a character string such as that of expression (2) by referencing the ID number OD_ID output from the OD generation circuit 304, and transfers the character string to the BIFS encoder 308. The BIFS encoder 308 converts the scene description SD of ASCII format, supplied from the storage device 302, into a scene description B-SD of binary format by a predetermined method, and replaces the URL included in the scene description SD with the URL or character string supplied from the URL change circuit 309. After this, the scene description B-SD of binary format is transferred to the multiplexer circuit 303. Figure 16 shows an example of a scene description SD in binary format for joining a moving image as a texture. The URL on line 29 is a character string of ASCII format transferred from the URL change circuit 309. That is, in this embodiment, a URL is described as a character string in the binary format. Figure 17 shows an example of a binary format of a scene description SD for joining a still image as a texture. As in the case of Figure 16, the URL on line 17 of Figure 17 is a character string of ASCII format. The scene description SD that has been converted to the binary format by the BIFS encoder 308 is supplied to the multiplexer circuit 303 and multiplexed with the object descriptors OD and the ES of the AV data (bit stream). The multiplexed bit stream FS leaves the multiplexer circuit 303 and is supplied to the decoder apparatus through a transmission line, for example. Next, referring to Figure 18, a description will be made of an embodiment of the decoding apparatus corresponding to the coding apparatus of Figure 15. Figure 18 is a block diagram showing a second embodiment of the decoding apparatus according to the invention. In Figure 18, the portions corresponding to the portions of Figure 8 have the same reference symbols as the latter and will not be described again. In the embodiment of Figure 18, a URL conversion circuit 412 is added to the embodiment of Figure 8. In addition, the analysis circuit 410 supplies information that is expressed as a character string of ASCII format.
The remaining configuration is the same as in the embodiment of Figure 8. The URL conversion circuit 412 converts the information expressed as a character string of ASCII format into the ID number OD_ID, which is the ID of the corresponding object descriptor OD, and supplies it to the reconstruction circuit 411. The operation of this embodiment will be briefly described below. The URL that has been extracted from a node by the analysis circuit 410 is sent to the URL conversion circuit 412. If the URL is a character string having, for example, the format of expression (2), the URL conversion circuit 412 converts the character string into the ID number OD_ID and supplies it to the reconstruction circuit 411. As a result, the reconstruction circuit 411 joins the corresponding AV data, such as a texture, to the node based on the ID number OD_ID included in the node. However, if the extracted URL designates a file stored on another server in the network (i.e., the URL is a character string having, for example, the format of expression (1)), the URL conversion circuit 412 supplies the information to the demultiplexer circuit 404, and the demultiplexer circuit 404 issues a file transmission request to that server. As a result, a multiplexed bit stream FS' is transmitted, a similar process is executed, and a presentation operation is performed. According to the above embodiment, even if the ES of the AV data (bit stream) to be joined to a node exists on another server in the network, the desired AV data (bit stream) can be acquired and displayed. Next, a description will be made, referring to Figure 19, of a third embodiment of the coding apparatus according to the invention. Figure 19 is a block diagram showing the third embodiment of the coding apparatus according to the invention. In Figure 19, those portions having corresponding portions in Figure 1 take the same reference symbols as the latter and will not be described again. In the embodiment of Figure
19, a URL change circuit 309, a switch 310 and a control circuit 311 are added to the embodiment of Figure 1. In addition, the output data of the analysis circuit 307 and the ID number OD_ID of the OD generation circuit 304 are sent to the URL change circuit 309. The output data of the URL change circuit 309 and the ID number OD_ID of the OD generation circuit 304 are sent to the switch 310, and the control circuit 311 controls the switch 310. The remaining configuration is the same as in the embodiment of Figure 1. The URL change circuit 309 converts the ID number OD_ID leaving the OD generation circuit 304 into a corresponding character string in ASCII format and transfers it. Since the operation of the URL change circuit 309 was described in the second embodiment of Figure 15, it will not be described again. Controlled by the control circuit 311, the switch 310 selects either the ID number OD_ID leaving the OD generation circuit 304 or the URL leaving the URL change circuit 309, and transfers the selected OD_ID or URL to the BIFS encoder 308. The control circuit 311 controls the switching of the switch 310 according to the type of application, for example. The operation of this embodiment will now be described briefly. The URL whose format has been converted by the URL change circuit 309 (the details are explained in the second embodiment and will not be repeated here) is supplied to the switch 310. In the same way, the ID number OD_ID sent from the OD generation circuit 304 is supplied to the switch 310. The connection of the switch 310 changes under the control of the control circuit 311. For example, for real-time communication or a hardware implementation, it is advantageous that the ID number OD_ID be directly described as a number in the form of, for example, a 10-bit flag instead of a character string.
Therefore, in such an application, the switch 310 is controlled by the control circuit 311 to select the output data of the OD generation circuit 304, in which case the ID number OD_ID is recorded in a scene description B-SD of binary format by the BIFS encoder 308. If the AV data (bit stream) designated by a URL are stored on another server in the network, the control circuit 311 controls the switch 310 to change its connection so that the output data of the URL change circuit 309 are selected, whereby the URL is output and registered by the BIFS encoder 308. On the other hand, in the case of an application on a computer, it is advantageous that a stream be designated by a URL in the form of a character string, owing to the high degree of flexibility. Therefore, in this application, the switch 310 is controlled to make a connection to the URL change circuit 309, whereby a URL is registered in a scene description B-SD of binary format by the BIFS encoder 308. Figure 20 shows an example of a binary format of a scene description B-SD for joining a moving image as a texture. In Figure 20, "isString" on lines 29 and 30 is a 1-bit flag that indicates whether the ID number OD_ID or a URL is described. If this value is "0", the 10-bit ID number OD_ID is registered in the node. If the value of "isString" is "1", a URL is registered. The URL is a character string that has been rewritten by the URL change circuit 309 to indicate the ID number OD_ID of a moving image that is to be joined to the node.
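The switchable node field described above might be sketched as follows; the tuple representation stands in for the actual BIFS bit packing (a 1-bit "isString" flag followed by either a 10-bit number or a string), and the function name is invented for illustration:

```python
def encode_reference(od_id=None, url=None) -> tuple:
    """Encode a node's stream reference as (isString, payload)."""
    if url is None:
        # isString = 0: the OD_ID is written directly as a 10-bit number,
        # the form preferred for real-time or hardware applications.
        assert 0 <= od_id < 1024, "OD_ID must fit in 10 bits"
        return (0, od_id)
    # isString = 1: a character-string URL follows, the form preferred
    # for computer applications needing more flexibility.
    return (1, url)

print(encode_reference(od_id=4))          # (0, 4)
print(encode_reference(url="mpeg4://4"))  # (1, 'mpeg4://4')
```

Either branch yields the same logical reference; the 1-bit flag is what lets a decoder know which of the two layouts to parse next.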
Figure 21 shows an example of a binary format of a scene description B-SD for joining a still image as a texture. In this figure, as in the previous case, "isString" on lines 17 and 18 is a 1-bit flag that indicates whether the ID number OD_ID or a URL is described. A multiplexed stream FS that has been encoded by the above encoding apparatus is transmitted to the decoding apparatus via a transmission line. Figure 22 is a block diagram showing the third embodiment of the decoding apparatus, corresponding to the coding apparatus of Figure 19, according to the invention. In Figure 22, those portions having corresponding portions in Figure 8 receive the same reference symbols as the latter and will not be described further. In the embodiment of Figure 22, a URL conversion circuit 412 is added to the embodiment of Figure 8. The remaining configuration is the same as in the embodiment of Figure 8. In this embodiment, the analysis circuit 410 decodes "isString". If this value is "1", the analysis circuit 410 supplies a URL to the URL conversion circuit 412. If this value is "0", the analysis circuit 410 decodes the ID number OD_ID and supplies the result to the reconstruction circuit 411. If the URL is described in the form of, for example, expression (2), the URL conversion circuit 412 decodes the ID number OD_ID and outputs the result to the reconstruction circuit 411. If the URL indicates an existing file on another server, the information is sent to the demultiplexer circuit 404, and the demultiplexer circuit 404 accesses that server and reads the desired file. The operation of this embodiment will be briefly described below. A read scene description SD (node) is supplied to the analysis circuit 410 and analyzed there. The analyzed scene description is supplied to the reconstruction circuit 411. Furthermore, the analysis circuit 410 decodes "isString" and judges whether its value is "1".
If this value is judged to be "1", the analysis circuit 410 supplies the URL conversion circuit 412 with the URL of the AV data (bit stream) to be attached, such as a texture, to the node. If the URL is described in the form of, for example, expression (2) (i.e., the header of the character string is "mpeg4"), the URL conversion circuit 412 decodes the ID number OD_ID, which is the ID of an object descriptor OD, from the character string and sends it to the reconstruction circuit 411. If the URL designates a file that exists on another server, the information is supplied to the demultiplexer circuit 404, and the demultiplexer circuit 404 accesses this server, requests the server to transfer the desired file, and receives it. Even when communication is made with a plurality of servers, each server operates in the same way as already described. On the other hand, if "isString" is "0", the analysis circuit 410 decodes the ID number OD_ID and outputs the result to the reconstruction circuit 411. The remaining operation is the same as in the first embodiment and will not be described again. According to the above embodiment, the most suitable coding method can be selected according to the type of application. According to the invention, a recording medium, such as a disk, DVD-R, CD-R, CD-ROM, etc., contains encoded image signals generated by the coding method as already noted, and these encoded image signals are decoded when they are read from the recording medium. Although the apparatus and method of coding and decoding according to this invention have been shown with respect to the block diagrams, in addition to providing different physical elements for each block, the method and apparatus can be implemented in a multi-purpose (general) computer programmed for this use.
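The decoding-side handling of a node reference described above can be sketched roughly as below; the function name and the ("local"/"remote") return values are hypothetical, and a real decoder would hand the remote case to the demultiplexer circuit 404 for a file transmission request:

```python
def convert_url(url: str):
    """Classify a node's URL as a local OD_ID reference or a remote file.

    A string of the form "mpeg4://<n>" (expression (2)) is converted
    back into the integer ID number OD_ID for the reconstruction
    circuit; any other URL (expression (1)) names a file on another
    server and must be fetched over the network.
    """
    prefix = "mpeg4://"
    if url.startswith(prefix):
        return ("local", int(url[len(prefix):]))   # OD_ID for circuit 411
    return ("remote", url)                          # request file elsewhere

print(convert_url("mpeg4://4"))
print(convert_url("http://serverB/AV_scene2/object_file.2"))
```

The branch mirrors the document's two cases: a matched header yields an OD_ID to collate against the multiplexed object descriptors, while anything else triggers a request to the designated server.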
In this sense, the recording medium or other storage device may contain operating instructions (program source code or software) to perform each of the steps set forth in the methods for the coding and decoding operations mentioned above. It should also be noted that, instead of the recording medium, a transmission channel connected to a communication network or the like (for example, the Internet, digital satellite, etc.) can be provided to receive and transmit data from an encoder and to decode the encoded data. The apparatus and method for encoding and decoding according to the invention can be employed to encode and decode information from a digital video disk, an image database, image compression and expansion units, an image downloaded from
the Internet, or software modules that implement these systems, for example. In the coding apparatus, the coding method and the recording medium, the modeling data in three-dimensional space (VRML data) are input and the AV data (bit stream) are also input. Location indication data (a URL) included in a node are extracted from the input modeling data in three-dimensional space (VRML data). The extracted location indication data (URL) are converted into a flow ID corresponding to the AV data (bit stream) designated by the location indication data (URL). The location indication data (URL) of the node are replaced by the flow ID obtained by the conversion. The modeling data in three-dimensional space (VRML data) obtained by the substitution and the AV data are multiplexed in the same flow. Therefore, it becomes possible to transmit an object described as modeling data in three-dimensional space (VRML data) and a natural image compressed according to, for example, the MPEG scheme in a state in which they are multiplexed in the same flow. In the decoding apparatus, the decoding method and the recording medium, the nodes are extracted from the multiplexed data and the AV data (bit stream) are extracted from the multiplexed data. The information indicating a correlation between the nodes and the AV data (bit stream) is extracted from the nodes. The nodes are collated (matched) with the AV data (bit stream) based on the extracted information indicating the correlation. The nodes and the AV data (bit stream) are combined based on the result of the collation. Therefore, it becomes possible to decode data that have been transmitted in a state in which an object described as modeling data in three-dimensional space (VRML data) and a natural image compressed according to, for example, the MPEG scheme are multiplexed in the same data flow.
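The overall substitution and collation summarized above can be illustrated with a toy round trip. All data values, dictionary names and the "mux" structure are made up for illustration; a real multiplexed stream FS interleaves binary scene descriptions, object descriptors and ES data rather than Python objects:

```python
# Inputs on the encoding side: VRML-like nodes carrying URLs, and the
# AV data (bit streams) those URLs designate.
nodes = [{"url": "http://serverA/AV_scene1/object_file.1"}]
streams = {"http://serverA/AV_scene1/object_file.1": b"<AV bits>"}

# Encoder side: assign a flow ID per stream, replace each node's URL
# with the flow ID, and bundle nodes and streams into one "mux".
flow_ids = {url: i + 1 for i, url in enumerate(streams)}
mux = {
    "nodes": [{"flow_id": flow_ids[n["url"]]} for n in nodes],
    "streams": {flow_ids[url]: es for url, es in streams.items()},
}

# Decoder side: collate each node with the stream carrying its ID,
# then the two would be combined (e.g. stream used as a texture).
for node in mux["nodes"]:
    es = mux["streams"][node["flow_id"]]
    print(node["flow_id"], es)
```

However simplified, this shows why the substitution works: once both sides agree that the node carries a flow ID instead of a URL, the correlation survives multiplexing without any out-of-band lookup.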
It will thus be seen that the objects set forth above, among those made apparent from the preceding description, are efficiently attained and, since certain changes may be made in carrying out the above method and in the construction set forth without departing from the spirit and scope of the invention, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense. It should also be understood that the following claims are intended to cover all of the generic and specific features of the invention described herein, and all statements of the scope of the invention which, as a matter of language, might be said to fall therebetween.
Claims (50)
CLAIMS 1. A method for producing modeling data in three-dimensional space defined by a plurality of nodes and image/audio data specified by a position included in the nodes, the method comprising: extracting a respective position of a node from the modeling data in three-dimensional space; converting the extracted position into a flow ID corresponding to the image/audio data associated with the position; replacing the position with the flow ID; and multiplexing the image/audio data and the modeling data in three-dimensional space, including the flow ID, so as to produce a bit stream. 2. The method according to claim 1, wherein the bit stream includes additional information including the flow ID and defining the image/audio data, this additional information having been multiplexed with the image/audio data and the modeling data in three-dimensional space in the bit stream. 3. The method according to claim 1, wherein the modeling data in three-dimensional space are described by the Virtual Reality Modeling Language (VRML), the position is represented by a Uniform Resource Locator (URL) expressed in ASCII format, and the flow ID is expressed in binary format. 4. The method according to claim 3, further comprising converting the flow ID into a character string, and determining whether to replace the position of the image/audio data with the flow ID or with the character string depending on whether the image/audio data are supplied by the same server or by different servers. 5. The method according to claim 4, wherein the bit stream includes additional information including the flow ID and defining the image/audio data, the additional information having been multiplexed with the image/audio data and the modeling data in three-dimensional space in the bit stream. 6.
A method for producing modeling data in three-dimensional space defined by a plurality of nodes and image/audio data specified by a position included in the nodes, the method comprising: extracting a respective position of a node from the modeling data in three-dimensional space; converting the extracted position into a flow ID corresponding to the image/audio data associated with the position; converting the flow ID into a character string; replacing the position with the character string; and multiplexing the image/audio data and the modeling data in three-dimensional space, including the character string, such that a bit stream is produced. 7. The method according to claim 6, wherein the position is replaced with the character string depending on whether the image/audio data are supplied by the same server or by different servers. 8. The method according to claim 6, wherein the modeling data in three-dimensional space are described by the Virtual Reality Modeling Language (VRML), the position is represented by a Uniform Resource Locator (URL) expressed in ASCII format, and the flow ID is expressed in binary format. 9. The method according to claim 6, wherein the bit stream includes additional information including the flow ID and defining the image/audio data, the additional information having been multiplexed with the image/audio data and the modeling data in three-dimensional space in the bit stream. 10.
An apparatus for producing modeling data in three-dimensional space defined by a plurality of nodes and image/audio data specified by a position included in the nodes, the apparatus comprising: means for extracting a respective position of a node from the modeling data in three-dimensional space; means for converting the extracted position into a flow ID corresponding to the image/audio data associated with the position; means for replacing the position with the flow ID; and means for multiplexing the image/audio data and the modeling data in three-dimensional space, including the flow ID, so as to produce a bit stream. 11. The apparatus according to claim 10, wherein the bit stream includes additional information including the flow ID and defining the image/audio data, this additional information having been multiplexed with the image/audio data and the modeling data in three-dimensional space in the bit stream. 12. The apparatus according to claim 10, wherein the modeling data in three-dimensional space are described by the Virtual Reality Modeling Language (VRML), the position is represented by a Uniform Resource Locator (URL) expressed in ASCII format, and the flow ID is expressed in binary format. 13. The apparatus according to claim 12, further comprising means for converting the flow ID into a character string, and means for determining whether the position of the image/audio data is replaced with the flow ID or with the character string depending on whether the image/audio data are supplied by the same server or by different servers. 14. The apparatus according to claim 13, wherein the bit stream includes additional information including the flow ID and defining the image/audio data, the additional information having been multiplexed with the image/audio data and the modeling data in three-dimensional space in the bit stream. 15. An apparatus
for producing modeling data in three-dimensional space defined by a plurality of nodes and image/audio data specified by a position included in the nodes, the apparatus comprising: means for extracting a respective position of a node from the modeling data in three-dimensional space; means for converting the extracted position into a flow ID corresponding to the image/audio data associated with the position; means for converting the flow ID into a character string; means for replacing the position with the character string; and means for multiplexing the image/audio data and the modeling data in three-dimensional space, including the character string, so as to produce the bit stream. 16. The apparatus according to claim 15, wherein the position is replaced with the character string depending on whether the image/audio data are supplied by the same server or by different servers. 17. The apparatus according to claim 15, wherein the modeling data in three-dimensional space are described by the Virtual Reality Modeling Language (VRML), the position is represented by a Uniform Resource Locator (URL) expressed in ASCII format, and the flow ID is expressed in binary format. 18. The apparatus according to claim 15, wherein the bit stream includes additional information including the flow ID and defining the image/audio data, the additional information having been multiplexed with the image/audio data and the modeling data in three-dimensional space in the bit stream. 19.
A method for processing a bit stream including image/audio data and modeling data in three-dimensional space, comprising a plurality of nodes, to produce a presentation image, the method comprising: receiving the bit stream; demultiplexing the received bit stream into a flow ID, modeling data in three-dimensional space and image/audio data; and providing a correspondence between the image/audio data and a respective node according to the flow ID so that a presentation image is produced. 20. The method according to claim 19, wherein the modeling data in three-dimensional space are described by the Virtual Reality Modeling Language (VRML), and the flow ID information is expressed in binary format. 21. The method according to claim 19, wherein the correspondence between the image/audio data and the node information is, according to the flow ID, expressed by a first expression or by a character string corresponding to the flow ID expressed by a second expression; and wherein the node includes a flag to indicate whether the first or the second expression has been used depending on whether the image/audio data have been supplied by the same server or by different servers. 22. The method according to claim 19, wherein the bit stream includes additional information including the flow ID and defining the image/audio data, the additional information having been multiplexed with the image/audio data and the modeling data in three-dimensional space in the bit stream; and wherein the flow ID included in the node is checked against the flow ID included in the additional information. 23.
A method for processing a bit stream including image/audio data and modeling data in three-dimensional space comprising a plurality of nodes to produce a presentation image, the method comprising: receiving the bit stream that includes the image/audio data and the modeling data in three-dimensional space comprising the nodes, and producing the modeling data in three-dimensional space and the image/audio data; converting character string information into flow ID information, the character string information being information indicating a correlation between a node and the image/audio data; and linking the image/audio data and the node according to the converted flow ID information. 24. The method according to claim 23, wherein the information indicating the correlation is character string information corresponding to the flow ID information or information designating a position; wherein the image/audio data are linked to the node according to the converted flow ID information if the information indicating the correlation is character string information, and the image/audio data are linked from a provider portion designated by the position designation information to the node if the information indicating the correlation is information designating a position. 25. The method according to claim 23, wherein the modeling data in three-dimensional space are described by the Virtual Reality Modeling Language (VRML), the character string information is expressed in ASCII format, and the converted flow ID information is expressed in binary format.
26. The method according to claim 23, wherein the bit stream includes information that defines the image/audio data and includes the flow ID; and wherein the converted flow ID information is collated with the flow ID information included in the information defining the image/audio data, and the image/audio data are linked to the node in accordance with the result of the collation. 27. An apparatus for processing a bit stream including image/audio data and modeling data in three-dimensional space comprising a plurality of nodes to produce a presentation image, the apparatus comprising: means for receiving the bit stream; means for demultiplexing the received bit stream into a flow ID, the modeling data in three-dimensional space and the image/audio data; and means for providing a correspondence between the image/audio data and a respective node, according to the flow ID, so as to produce a presentation image. 28. The apparatus according to claim 27, wherein the modeling data in three-dimensional space are described by the Virtual Reality Modeling Language (VRML) and the flow ID information is expressed in binary format. 29. The apparatus according to claim 27, wherein the correspondence between the image/audio data and the node information is in accordance with the flow ID expressed by a first expression or with a character string corresponding to the flow ID expressed by a second expression; and wherein the node includes a flag to indicate whether the first or the second expression has been used depending on whether the image/audio data have been supplied by the same server or by different servers. 30.
The apparatus according to claim 27, wherein the bit stream includes additional information including the flow ID and defining the image/audio data, the additional information having been multiplexed with the image/audio data and the modeling data in three-dimensional space in the bit stream; and wherein the flow ID included in the node is checked against the flow ID included in the additional information. 31. An apparatus for processing a bit stream including image/audio data and modeling data in three-dimensional space comprising a plurality of nodes to produce a presentation image, the apparatus comprising: means for receiving the bit stream that includes the image/audio data and the modeling data in three-dimensional space comprising the nodes; means for producing the modeling data in three-dimensional space and the image/audio data; means for converting character string information into flow ID information, the character string information being information indicating a correlation between the nodes and the image/audio data; and means for linking the image/audio data and the node according to the converted flow ID information. 32. The apparatus according to claim 31, wherein the information indicating the correlation is character string information corresponding to the flow ID information or position designation information; wherein the image/audio data are linked to the node according to the converted flow ID information if the information indicating the correlation is character string information, and the image/audio data are linked from a provider portion designated by the position designation information to the node if the information indicating the correlation is position designation information. 33.
The apparatus according to claim 31, wherein the modeling data in three-dimensional space is described in the Virtual Reality Modeling Language (VRML), the character string information is expressed in ASCII format, and the converted stream ID information is expressed in binary format.

34. The apparatus according to claim 31, wherein the bit stream includes information defining the image/audio data and including the stream ID; and wherein the converted stream ID information is compared with the stream ID information included in the information defining the image/audio data, and the image/audio data is linked to the node according to the result of the comparison.

35. A recording medium having recorded therein a data producing program for producing modeling data in three-dimensional space comprising a plurality of nodes and image/audio data designated by position designation information included in the nodes of the modeling data in three-dimensional space, the data producing program being executed to perform the steps of: extracting the position designation information, included in a node, of the modeling data in three-dimensional space; converting the extracted position designation information into stream ID information corresponding to the image/audio data designated by the extracted position designation information; replacing the position designation information included in the node with the stream ID information; and multiplexing the image/audio data and the modeling data in three-dimensional space including the stream ID information, so as to produce a bit stream.

36.
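The authoring steps recited in claim 35 (extract a node's position designation, map it to a stream ID, and rewrite the node) can be sketched as follows; the node dictionaries and the mapping scheme are invented for illustration and are not the claimed data format:

```python
# Sketch of claim 35's authoring steps: each node's position
# designation (e.g. a URL) is replaced by a numeric stream ID before
# scene and media are multiplexed into one bit stream.
# All structures here are illustrative, not the claimed format.

def assign_stream_ids(nodes):
    """Replace each node's url with a stream ID; return the url->ID map."""
    url_to_id = {}
    next_id = 1
    for node in nodes:
        url = node.pop("url", None)        # extract position designation
        if url is None:
            continue
        if url not in url_to_id:           # one ID per distinct source
            url_to_id[url] = next_id
            next_id += 1
        node["stream_id"] = url_to_id[url]  # rewrite the node
    return url_to_id

scene = [{"type": "MovieTexture", "url": "movie1.mpg"},
         {"type": "AudioClip",    "url": "sound1.aac"}]
mapping = assign_stream_ids(scene)
print(scene[0]["stream_id"], mapping["sound1.aac"])  # → 1 2
```

The rewritten scene, with URLs replaced by compact IDs, is what gets multiplexed together with the media streams in the later claims.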
The recording medium according to claim 35, wherein the data producing program is further executed to perform the steps of: converting the stream ID information into character string information expressed by a first expression; and determining whether the position designation information included in the node is replaced with the stream ID information expressed by a second expression or with the character string information expressed by the first expression, wherein the position designation information included in the node is replaced according to the determined result.

37. A recording medium having recorded therein a data producing program for producing modeling data in three-dimensional space comprising a plurality of nodes and image/audio data designated by position designation information included in the nodes of the modeling data in three-dimensional space, the data producing program being executed to perform the steps of: extracting the position designation information, included in a node, of the modeling data in three-dimensional space; converting the extracted position designation information into stream ID information corresponding to the image/audio data designated by the extracted position designation information; converting the stream ID information into character string information; replacing the position designation information included in the node with the character string information; and multiplexing the image/audio data and the modeling data in three-dimensional space including the character string information, so as to produce a bit stream.

38.
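Claims 36, 39 and 42 choose, per node, between two expressions of the same correlation (the stream ID itself, or a character string corresponding to it) and record the choice as a flag. A minimal sketch, in which the `same_server` criterion and the `"id:N"` string form are assumed stand-ins for whichever rule and syntax the encoder actually applies:

```python
# Illustrative sketch: a node carries either the stream ID itself
# (one expression) or a character string naming it (the other
# expression), plus a flag saying which was used. The same_server
# criterion and "id:N" syntax are assumptions for illustration.

def encode_reference(stream_id, same_server):
    if same_server:
        return {"is_string": 0, "ref": stream_id}          # ID expression
    return {"is_string": 1, "ref": "id:%d" % stream_id}    # string expression

def decode_reference(node_ref):
    """Recover the stream ID regardless of expression (cf. claim 39)."""
    if node_ref["is_string"]:
        return int(node_ref["ref"].split(":", 1)[1])  # string -> stream ID
    return node_ref["ref"]

print(decode_reference(encode_reference(5, True)),
      decode_reference(encode_reference(5, False)))  # → 5 5
```

Either way the receiver ends up with the same numeric stream ID, which is the point of the flag-driven determination step on the decoding side.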
A recording medium having recorded therein a data processing program for producing a presentation image from a bit stream including image/audio data and modeling data in three-dimensional space comprising a plurality of nodes, the data processing program being executed to perform the steps of: receiving the bit stream including the image/audio data and the modeling data in three-dimensional space comprising the plurality of nodes; producing the modeling data in three-dimensional space and the image/audio data; and linking the image/audio data with a node according to information indicating a correlation between the node and the image/audio data, the stream ID information being the information indicating the correlation.

39. The recording medium according to claim 38, wherein the information indicating the correlation is the stream ID information expressed by a first expression, or character string information corresponding to the stream ID information, this character string information being expressed by a second expression; and wherein the node includes flag information indicating the first or the second expression of the information indicating the correlation, the data processing program being further executed to perform the steps of: determining an expression of the information indicating the correlation according to the flag information; and converting the character string information, expressed by the second expression, into stream ID information expressed by the first expression, wherein the image/audio data is linked to the node according to the stream ID information if the information indicating the correlation is stream ID information, and the image/audio data is linked to the node according to the converted stream ID information if the information indicating the correlation is character string information.

40. A recording medium having recorded therein a data processing program for producing a presentation image from a stream including modeling data in three-dimensional space comprising a plurality of nodes and image/audio data, the data processing program comprising the steps of: receiving the stream including the modeling data in three-dimensional space comprising the plurality of nodes and the image/audio data, and transferring the modeling data in three-dimensional space and the image/audio data; converting character string information into stream ID information, the character string information being information indicating a correlation between a node and the image/audio data; and linking the image/audio data with the node according to the converted stream ID information.

41. A recording medium having recorded therein a bit stream including modeling data in three-dimensional space comprising a plurality of nodes and image/audio data designated by position designation information included in the nodes of the modeling data in three-dimensional space, the bit stream being prepared by the steps of: extracting the position designation information, included in a node, from the modeling data in three-dimensional space; converting the extracted position designation information into stream ID information corresponding to the image/audio data designated by the extracted position designation information; replacing the position designation information included in the node with the stream ID information; and multiplexing the image/audio data and the modeling data in three-dimensional space including the stream ID information, so as to produce the bit stream.

42.
The recording medium according to claim 41, wherein the bit stream is further prepared by the steps of: converting the stream ID information into character string information expressed by a first expression; and determining whether the position designation information included in the node is replaced with the stream ID expressed by a second expression or with the character string information expressed by the first expression, wherein the position designation information included in the node is replaced according to the determined result, and information representative of the determined expression is inserted in place of the replaced information in the node.

43. A recording medium having recorded therein a bit stream including modeling data in three-dimensional space comprising a plurality of nodes and image/audio data designated by position designation information included in the nodes of the modeling data in three-dimensional space, the bit stream being prepared by the steps of: extracting the position designation information, included in a node, of the modeling data in three-dimensional space; converting the extracted position designation information into stream ID information corresponding to the image/audio data designated by the extracted position designation information; converting the stream ID information into character string information; replacing the position designation information included in the node with the character string information; and multiplexing the image/audio data and the modeling data in three-dimensional space including the character string information, so as to produce a bit stream.

44.
A recording medium manufactured by a producing device, the recording medium having recorded thereon a signal having a data stream that includes modeling data in three-dimensional space comprising a plurality of nodes and image/audio data, the recording medium having the recorded signal processed by the steps of: receiving the data stream including the modeling data in three-dimensional space comprising the nodes and the image/audio data; transferring the modeling data in three-dimensional space and the image/audio data; and linking the image/audio data to a node according to information indicating a correlation between the node and the image/audio data, the stream ID information being the information indicating the correlation.

45. The recording medium according to claim 44, wherein the information indicating the correlation is the stream ID information expressed by a first expression, or character string information corresponding to the stream ID information, the character string information being expressed by a second expression; and wherein the node includes flag information indicating an expression of the information indicating the correlation, the recording medium having the recorded signal further processed by the steps of: determining an expression of the information indicating the correlation according to the flag information; and converting the character string information, expressed by the second expression, into stream ID information expressed by the first expression, wherein the image/audio data is linked to the node according to the stream ID information if the information indicating the correlation is stream ID information, the image/audio data being linked to the node according to the converted stream ID information if the information indicating the correlation is character string information.

46.
A recording medium producible by means of a production device, the recording medium having recorded therein a signal having a stream that includes modeling data in three-dimensional space comprising a plurality of nodes and image/audio data, the recording medium having the recorded signal processed by the steps of: receiving the stream including the modeling data in three-dimensional space comprising the nodes and the image/audio data; transferring the modeling data in three-dimensional space and the image/audio data; converting character string information into stream ID information, the character string information being information indicating a correlation between a node and the image/audio data; and linking the image/audio data with the node according to the converted stream ID information.

47. An apparatus for providing modeling data in three-dimensional space defined by a plurality of nodes and image/audio data specified by a position included in the nodes, the apparatus comprising: an analysis circuit for extracting a respective position from a node of the modeling data in three-dimensional space; a converter for converting the extracted position into a stream ID corresponding to the image/audio data associated with the position; an encoder for replacing the position with the stream ID; and a multiplexer for multiplexing the image/audio data and the modeling data in three-dimensional space including the stream ID, so as to produce a bit stream.

48.
An apparatus for providing modeling data in three-dimensional space defined by a plurality of nodes and image/audio data specified by a position included in the nodes, the apparatus comprising: an analysis circuit for extracting a respective position of a node from the modeling data in three-dimensional space; a converter for converting the extracted position into a stream ID corresponding to the image/audio data associated with the position; a conversion circuit for converting the stream ID into character string information; an encoder for replacing the position with the character string; and a multiplexer for multiplexing the image/audio data and the modeling data in three-dimensional space including the character string, so as to produce a bit stream.

49. An apparatus for processing a bit stream including image/audio data and modeling data in three-dimensional space comprising a plurality of nodes, to produce a presentation image, the apparatus comprising: a demultiplexer for receiving the bit stream and for demultiplexing the received bit stream into a stream ID, the modeling data in three-dimensional space and the image/audio data; and a reconstruction circuit for providing a correspondence between the image/audio data and a respective node in accordance with the stream ID, so that the presentation image is produced.

50.
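The encoder-side chain of claims 47–48 ends in a multiplexer that combines the rewritten scene description and the media into one bit stream, which the demultiplexer of claim 49 then splits apart. A toy length-prefixed multiplex illustrates the round trip; the packet layout below is invented for illustration and is not the claimed bit-stream syntax:

```python
import struct

# Toy multiplexer: packs the scene description and each media stream
# as (stream_id, length, payload) records. The layout is illustrative
# only; the actual claimed bit-stream syntax is not reproduced here.

SCENE_ID = 0  # assumed reserved ID for the scene-description stream

def mux(scene_bytes, media):
    """media: dict mapping stream_id -> payload bytes."""
    out = bytearray()
    for sid, payload in [(SCENE_ID, scene_bytes)] + sorted(media.items()):
        out += struct.pack(">HI", sid, len(payload)) + payload
    return bytes(out)

def demux(bitstream):
    """Split the bit stream back into per-stream-ID payloads."""
    streams, pos = {}, 0
    while pos < len(bitstream):
        sid, length = struct.unpack_from(">HI", bitstream, pos)
        pos += 6
        streams[sid] = bitstream[pos:pos + length]
        pos += length
    return streams

bs = mux(b"scene", {1: b"video", 2: b"audio"})
print(demux(bs))  # → {0: b'scene', 1: b'video', 2: b'audio'}
```

Because the scene travels in the same stream as the media, the receiver can resolve every stream ID reference in the nodes against payloads it has already demultiplexed.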
An apparatus for processing a bit stream including image/audio data and modeling data in three-dimensional space comprising a plurality of nodes, to produce a presentation image, the apparatus comprising: a demultiplexer for receiving the bit stream that includes the image/audio data and the modeling data in three-dimensional space comprising the nodes, and for transferring the modeling data in three-dimensional space and the image/audio data; a converter for converting character string information into stream ID information, the character string information being information indicating a correlation between a node and the image/audio data; and a reconstruction circuit for linking the image/audio data to the node according to the converted stream ID information.
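The reconstruction step shared by claims 27, 49 and 50 attaches each demultiplexed media stream to the node that references it by stream ID. A minimal linking sketch, with node and stream structures invented for illustration:

```python
# Illustrative reconstruction step: after demultiplexing, each node's
# stream reference (already converted to a numeric stream ID) selects
# the matching media payload. Structures are invented for illustration.

def link_streams(nodes, streams):
    """Attach each demultiplexed payload to the node referencing it."""
    for node in nodes:
        sid = node.get("stream_id")
        if sid in streams:
            node["media"] = streams[sid]  # correspondence per stream ID
    return nodes

nodes = [{"type": "MovieTexture", "stream_id": 1},
         {"type": "AudioClip",    "stream_id": 2}]
streams = {1: b"video-payload", 2: b"audio-payload"}
linked = link_streams(nodes, streams)
print(linked[0]["media"])  # → b'video-payload'
```

Once every node holds its payload, the scene can be rendered to produce the presentation image recited in the claims.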
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP9-275196 | 1997-09-22 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| MXPA98007713A true MXPA98007713A (en) | 1999-09-20 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US6611262B1 (en) | Generation of a bit stream containing binary image/audio data that is multiplexed with a code defining an object in ascii format | |
| JP4930810B2 (en) | Image processing apparatus, image processing method, and recording medium | |
| CN115379189B (en) | Data processing method of point cloud media and related equipment | |
| JPH09139940A (en) | Decoder for end-to-end stretchable video delivery system | |
| WO2001078398A1 (en) | Transcoding of compressed video | |
| JP2006505024A (en) | Data processing method and apparatus | |
| JP2001312741A (en) | Node processing method and apparatus for three-dimensional scene | |
| US7561745B2 (en) | Method and system for generating input file using meta representation on compression of graphics data, and animation framework extension (AFX) coding method and apparatus | |
| US20030046691A1 (en) | Data processing apparatus and method | |
| KR20030056034A (en) | MPEG-data receiving apparatus, MPEG-data transmitting/receiving system and method thereof | |
| CN116347118B (en) | Data processing method of immersion medium and related equipment | |
| JP2004537931A (en) | Method and apparatus for encoding a scene | |
| JP4499204B2 (en) | Image signal multiplexing apparatus and method, and transmission medium | |
| MXPA98007713A (en) | Generation of a bits flow containing binary image / audio data that multiplexes with a code defining an object in format as | |
| EP1435738A1 (en) | Method and system for generating input file using meta language regarding graphic data compression | |
| Puri et al. | Overview of the MPEG Standards | |
| US7456844B2 (en) | Image transmission method, computer-readable image transmission program, recording medium, and image transmission apparatus | |
| CN115037943B (en) | Media data processing method, device, equipment and readable storage medium | |
| EP1538841A2 (en) | Method and system for generating input file using meta representation of compression of graphics data, and animation framework extension (AFX) coding method and apparatus | |
| CN100496124C (en) | Method for generating input file using meta language regarding graphic data compression | |
| Kauff et al. | The MPEG-4 standard and its applications in virtual 3D environments | |
| Eleftheriadis | MPEG-4: An Object-Based Standard for Multimedia Coding | |
| CN116781675A (en) | Data processing method, device, equipment and medium of point cloud media | |
| DE NORMALISATION | Overview of the MPEG-4 Standard Executive Overview | |
| Ohm | Multimedia Representation Standards |