WO2001015017A1

WO2001015017A1 - Video data structure for video browsing based on event

Info

Publication number: WO2001015017A1
Application number: PCT/KR2000/000969
Authority: WO
Inventors: Jung Min Song; Jin Soo Lee
Original assignee: LG Electronics Inc
Current assignee: LG Electronics Inc
Priority date: 1999-08-26
Filing date: 2000-08-25
Publication date: 2001-03-01
Anticipated expiration: 2002-02-26
Also published as: KR20010019343A; KR100319158B1; AU6738500A

Abstract

A video data structure based on events for a video browsing system is disclosed. The video data structure allows an implementation of a video browser in which a user can easily browse and display a video based on events that are significant to constant relations between characters, variable relations between characters, or relations between a character and a place. The video data structure also allows the events to be displayed as summarized data of the corresponding video segments, thereby enabling efficient and user-friendly browsing of a video.

Description

VIDEO DATA STRUCTURE FOR VIDEO BROWSING BASED ON EVENT

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates to a video browsing system, and more particularly to a video data structure for video browsing based on event, in which a period satisfying event conditions based on semantic data may be summarized and displayed.

Background of the Related Art

Typically, users simply view movies and/or dramas as broadcast through a TV or played at a movie theater. However, a user may wish to view a particular movie or drama at a particular time, or wish to view only a particular section of a movie or a drama. Accordingly, various techniques which enable selective watching of a movie/drama or sections of a movie/drama have been suggested.

In the related art, for example, various video data may be represented or classified into format chunk, index chunk, media chunk, segment chunk, target chunk, and/or representation chunk. Also, data on various characters or objects, such as a name of an object, position on the screen, and numeric data with relation to a segment of the video data in which the object appears, may be represented by the target and representation chunk. Accordingly, a user can select an object through a table and reproduce for display a particular section where the object is shown in the video.

In another related art, various additional data of a video data are obtained before, during, or after the production of the video data. Thereafter, an additional information table of the obtained data is composed and provided to users. Namely, the additional data table may include a position where an actor appears, a position where a character of the actor appears, and a position where stage properties appear, such that a scene can be reproduced as selected by a user through the additional data table. For example, if a user selects a stage property, information on the selected stage property such as the manufacturer and price may be displayed on a screen, and the user may be able to connect with the manufacturer or a seller of the stage property through a network connection.

In still another related art, recording information on each section of a video in a video map has been suggested. That is, information such as the degree of violence, the degree of adult contents, the degree of importance of contents, character positions, and the degree of difficulty in understanding may be indicated for each section of a video in the video map. Thus, the user may set a degree of preference for one or more items of the video map, and only sections of the video meeting the set degree of preference would be reproduced, thereby limiting a display of particular contents to unauthorized viewers.

Other techniques in the related art which allow users to selectively view a portion of a video include a temporal relational graph of shots for a video. However, viewing a temporal relational graph is similar to viewing several representative scenes of a video and would not allow a user to easily follow the contents of a video. Similarly, other techniques in the related art as described above provide items simply arranged without any relation to the objects appearing in the movie or drama, based upon the selection of the user.

However, the contents of a movie or drama generally build around relations between characters, places, and events. For example, relations between characters may not change from the beginning to the end of the story or may continuously vary. Moreover, since one or more characters relate to a specific character in the movie or drama, the browsing method in the related art substantially fails to provide an accurate understanding of the story of the movie or drama to the user.

Therefore, techniques in the related arts have disadvantages in that it is difficult to understand a video centering on relations among characters according to the development of events, changes of relations, and relations among characters and places as events develop.

SUMMARY OF THE INVENTION

Accordingly, an object of the present invention is to solve at least the problems and disadvantages of the related art.

Another object of the present invention is to provide a more efficient video browsing system. Still another object of the present invention is to provide a video data structure based on events significant to constant and variable relations between characters, or relations between character and place. A further object of the present invention is to provide a video data structure for a video browser in which the contents of an event are summarized and displayed.

Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention. The objects and advantages of the invention may be realized and attained as particularly pointed out in the appended claims.

To achieve the objects and in accordance with the purposes of the present invention, as embodied and broadly described herein, a video data structure for video browsing based on event includes a syntactic description scheme (DS) of actual video segments, a semantic DS for describing the semantic data, and a visualization DS for displaying a summary of an entire video or a segment of a video.

In the above video data structure, the semantic DS has an event DS describing event data, wherein the event DS has linking data to link to the visualization DS for displaying summary data of a selected event and linking data to link to segments of the syntactic data for displaying actual video segments of the selected event.

In the present video data structure, the semantic DS includes at least one event/object relation graph data for describing a relation among object, place, and event; a constant relation between characters; or variable relations between characters. The event/object relation graph data may include linking data to link at least one object with at least one event to display a relation among object, place, and event, and an object type for identifying whether each object is a place or a character. Also, the event/object relation graph data may include linking data to link at least one object with at least one place and at least one event to display a relation among object, place, and event. Moreover, the event/object relation graph data may include two or more objects, a relation name, and a relation type to display a constant relation between characters, a variable relation between characters, and a relation with an event which corresponds to either a constant or variable relation.

In the present video data structure, the visualization DS is used in displaying a summary of an entire video or a video segment, and includes at least one of a segment linking data for linking segments to be used in a successive display to display the summary data, a key frame data for summarizing and displaying key frame based video data, and a highlight data for summarizing and displaying video data as a video highlight. Also, in the video data structure according to the present invention, the highlight data may have multi-levels depending on a degree of detail in the summarized data, and summarized data corresponding to each level includes a segment for linking segments which will be used in the highlight. If the highlight data has multi-levels, summarized data corresponding to each level may include a time data of a period which will be used in the highlight.

Furthermore, in the video data structure according to the present invention, the event DS links the semantic DS with the visualization DS, and includes linking data for linking highlight data to summarize a specific event. The event DS may further include one or a combination of linking data for linking the key frame data to summarize a specific event, linking data for linking at least one time data corresponding to a summarized period describing a specific event, linking data for linking at least one segment corresponding to a summarized period describing a specific event, and at least one time data corresponding to a summarized period describing a specific event.

Moreover, in the video data structure according to the present invention, the time data may be separated from the time data of the highlight data within the visualization DS data.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be described in detail with reference to the following drawings in which like reference numerals refer to like elements wherein:

Fig. 1 is a video data structure based on events in accordance with the present invention;

Figs. 2a and 2b show examples of a highlight DS in the video data structure based on events in accordance with the present invention;

Figs. 3a to 3e show examples of a structure which links a semantic DS with a visualization DS in the video data structure in accordance with the present invention;

Fig. 4 shows a video browser based on the video data structure in accordance with the present invention; and

Fig. 5 shows another video browser based on the video data structure in accordance with the present invention.

DETAILED DESCRIPTION

Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings.

Figs. 1 to 3 show examples of a video data structure according to the present invention. Generally, the video data structure of Figs. 1 to 3 is based on significant events of a video and supports a video browsing system based on content. A video browser based on content is disclosed in co-pending application ?, and is fully incorporated herein.

Particularly, the video data structure according to the present invention links relations between objects and changes in the relations between objects with corresponding objects and events of a video. For purposes of explaining the present invention, an object will generally be assumed to be a character or place in a video. The video data structure allows browsing of an event based on relations between characters or relations between characters and places.

Referring to Fig. 1, a visual description structure (DS) 101 is organized into a visualization DS 102, a syntactic structure DS 103, and a semantic structure DS 104.

The visualization DS 102 is used for displaying a summary of the plot of either an entire video or a video segment, and includes highlight data or key frame data. That is, the visualization DS 102 is organized into at least one reference to segments 113 which links video segments to be displayed, a key frame view DS 114 which is used for displaying summary video data based on key frames, and a highlight view DS 115 which is used for displaying summary video data as a video highlight.

The highlight view DS 115 may be configured as a highlight view DS 201 shown in Fig. 2a or a highlight view DS 204 shown in Fig. 2b. Generally, a plot of a video, whether for an entire video or a video segment, may be summarized briefly or summarized in greater detail.

Accordingly, the highlight view DS 201 is organized into a level 202 which has multiple levels of highlight data based upon a degree of detail in summarizing a video. Here, the highlight data corresponding to each level includes a segment 203 which links to video segments to be displayed as a video highlight.

Similarly, the highlight view DS 204 is organized into a level 205 which has multiple levels of highlight data based upon a degree of detail in summarizing a video. Here, the highlight data corresponding to each level includes a time DS 206 which is a time period used in displaying a video highlight.

Referring back to Fig. 1, the syntactic structure DS 103 is used for displaying the actual video and includes actual video segments to be displayed. The syntactic structure DS 103 is organized into actual video segment data of segment DS 105, and corresponding time DS 106 which is a temporal position of the segment DS 105 in a video.
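By way of a non-limiting illustration, the multi-level highlight view of Fig. 2b might be sketched in Python as follows. The class name, the field names, the use of seconds as the time unit, and the convention that a higher level number carries more detail are all assumptions made for this sketch; the specification does not prescribe them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A time period as (start, end). Seconds are an assumed unit.
TimePeriod = Tuple[float, float]


@dataclass
class HighlightViewDS:
    """Sketch of highlight view DS 204: levels 205, each holding time DSs 206."""

    # Maps a level number to the time periods played at that level.
    # Assumption: a higher level number means a more detailed summary.
    levels: Dict[int, List[TimePeriod]] = field(default_factory=dict)

    def periods(self, level: int) -> List[TimePeriod]:
        """Return the time periods used for the highlight at the given level."""
        return list(self.levels[level])

    def duration(self, level: int) -> float:
        """Total playing time of the highlight at the given level."""
        return sum(end - start for start, end in self.levels[level])


highlight = HighlightViewDS(
    levels={
        1: [(10.0, 15.0)],                 # brief summary
        2: [(10.0, 15.0), (40.0, 50.0)],   # more detailed summary
    }
)
```

The variant of Fig. 2a could be obtained in the same manner by storing segment identifiers at each level instead of time periods.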

The semantic structure DS 104 includes additional information describing a video, and is organized into an event DS 107 which represents event information, an object DS 108 which represents object information, and at least one event/object relation graph DS 109 which represents relation information. Namely, the event DS 107 describes events, and the object DS 108 describes objects such as characters and places. The event/object relation graph DS 109 describes constant relations or changes in relations between characters, relations between objects and places, or relations between objects and events. Moreover, the event/object relation graph DS 109 may include information of one or more relations.

Here, a constant relation means either a relation between characters that cannot change throughout a video, such as a parent to child relation, or a relation which is most representative among variable relations between characters.

The event DS 107 includes a reference data for linking to the visualization DS 102 to display summary data corresponding to a selected event, and a reference data for linking to segment DS 105 in syntactic structure DS 103 to display actual video segments corresponding to the selected event. That is, when an event is selected to display a corresponding video segment of a video, the event DS 107 includes a reference to segment 110 which links to actual video segments of the video corresponding to the selected event, a reference to visualization 111 for displaying a summary of the actual video segments corresponding to the selected event, and an annotation DS 112 for explaining the selected event.
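By way of a non-limiting illustration, the linkage from an event DS 107 to the segment DSs 105 of the syntactic structure DS 103 might be sketched in Python as follows. All class and field names, the string identifiers used as link targets, and the time unit are hypothetical choices made for this sketch and are not part of the disclosed structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TimeDS:
    start: float  # assumed unit: seconds from the start of the video
    end: float


@dataclass
class SegmentDS:
    segment_id: str  # hypothetical identifier used as a link target
    time: TimeDS     # temporal position of the segment (time DS 106)


@dataclass
class EventDS:
    name: str
    segment_refs: List[str] = field(default_factory=list)  # reference to segment 110
    visualization_ref: Optional[str] = None                 # reference to visualization 111
    annotation: Optional[str] = None                        # annotation DS 112


@dataclass
class SyntacticStructureDS:
    segments: List[SegmentDS] = field(default_factory=list)

    def find(self, segment_id: str) -> SegmentDS:
        """Look up a segment DS by its identifier."""
        for segment in self.segments:
            if segment.segment_id == segment_id:
                return segment
        raise KeyError(segment_id)


@dataclass
class SemanticStructureDS:
    events: List[EventDS] = field(default_factory=list)


def segments_for_event(event: EventDS, syntactic: SyntacticStructureDS) -> List[SegmentDS]:
    """Follow the references to segment of an event to the actual segments."""
    return [syntactic.find(ref) for ref in event.segment_refs]
```

In this sketch, selecting an event and calling segments_for_event yields the actual video segments to be displayed, while the visualization_ref field would be followed separately to obtain the summary data.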

In the above video data structure, the notation above each data such as {0,1}, {0,*}, or {1,*} indicates how many instances of the corresponding data may be present. For example, the notation of {0,1} for the visualization DS 102 indicates that the visual DS 101 can have zero or one visualization DS. On the other hand, the notation of {0,*} for the segment DS 105 indicates that the syntactic structure DS 103 may have from zero to any number of segment DSs.
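By way of a non-limiting illustration, the occurrence notation may be read as a pair of a minimum and a maximum count, and a check against it might be sketched in Python as follows. The names Cardinality and satisfies, and the use of None to stand for "*", are assumptions of this sketch.

```python
from typing import Optional, Sequence, Tuple

# (minimum, maximum) number of occurrences; None stands for "*" (no upper bound).
Cardinality = Tuple[int, Optional[int]]

ZERO_OR_ONE: Cardinality = (0, 1)      # {0,1}
ZERO_OR_MORE: Cardinality = (0, None)  # {0,*}
ONE_OR_MORE: Cardinality = (1, None)   # {1,*}


def satisfies(children: Sequence[object], cardinality: Cardinality) -> bool:
    """Return True if the number of child data respects the given notation."""
    minimum, maximum = cardinality
    count = len(children)
    if count < minimum:
        return False
    return maximum is None or count <= maximum
```

For example, a visual DS holding two visualization DSs would fail the {0,1} check, whereas a syntactic structure DS holding any number of segment DSs passes the {0,*} check.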

Furthermore, as the reference to visualization 111 links the semantic structure DS 104 with the visualization DS 102, the reference to visualization 111 may include at least one of the information shown in Figs. 3a to 3e. Particularly, Fig. 3a shows a reference to highlight view DS 301 which links the event DS 107 to the highlight view DS 115 for displaying the video segment corresponding to the selected event by a highlight of the video segment. A reference to key frame view DS 302, shown in Fig. 3b, links the event DS 107 to the key frame view DS 114 for displaying the video segments corresponding to the selected event by key frames. A reference to time DS 303, shown in Fig. 3c, links the event DS 107 to one or more time DSs 106 of video segments corresponding to the selected event. If the highlight view DS 115 includes the time DS 206, as shown in Fig. 2b, the reference to time DS 303 links to one or more time DSs 206 of the video segments corresponding to the selected event.

Fig. 3d shows a reference to segment DS 304 which links the event DS 107 to one or more segment DSs 105 corresponding to video segments which describe the selected event. If the highlight view DS 115 includes the segment 203, as shown in Fig. 2a, the reference to segment DS 304 links to one or more segments 203 corresponding to the video segments which describe the selected event.

Finally, a time DS 305, shown in Fig. 3e, is directly used when the highlight view DS 115 includes the time DS 206 as shown in Fig. 2b. The time DS 305 represents temporal data for a video segment which describes a selected event. That is, the video data structure of Fig. 3e directly includes the corresponding time DSs rather than linking to the time DSs 206 of Fig. 2b.

As described above, in the video data structure according to the present invention for video browsing, one or more relations include data for linking object(s) with event(s) to display a video segment showing a relation among objects, places, and events. At this time, an object is determined to be a place or a character by an object type. Also, a relation may include data for linking one or more objects with one or more places and events to display a relation among objects, places, and events. Alternatively, a relation may include two or more objects, a relation name, and a relation type to display a constant relation between characters, the variable relations between characters, and a relation with an event which shows the constant and variable relations.
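By way of a non-limiting illustration, a relation carrying two or more objects, a relation name, a relation type, and links to events might be sketched in Python as follows. The enumerations, class names, field names, and the sample relation names used with this sketch are hypothetical; the specification states only which items a relation may carry.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ObjectType(Enum):
    CHARACTER = "character"
    PLACE = "place"


class RelationType(Enum):
    CONSTANT = "constant"
    VARIABLE = "variable"


@dataclass(frozen=True)
class ObjectDS:
    name: str
    object_type: ObjectType  # identifies whether the object is a place or a character


@dataclass
class Relation:
    objects: List[ObjectDS]        # two or more linked objects
    name: str                      # relation name
    relation_type: RelationType    # constant or variable
    event_ids: List[str] = field(default_factory=list)  # links to events


def events_for(relations: List[Relation], first: str, second: str) -> List[str]:
    """Collect, without duplicates, the events of every relation that
    involves both named objects."""
    result: List[str] = []
    for relation in relations:
        names = {obj.name for obj in relation.objects}
        if first in names and second in names:
            for event_id in relation.event_ids:
                if event_id not in result:
                    result.append(event_id)
    return result
```

A browser built on such data could list the events returned by events_for once a user selects a pair of objects from the relation graph.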

Namely, the video browser of the present invention allows a user to select one or more relations from a relation graph between an object and a place, and displays events which describe the selected relations. At this time, if an event is selected from the displayed events, summaries of video segments, such as highlight or key frame data, corresponding to the selected event would be displayed to summarize the corresponding event. Here, a user may designate whether to display a summary of a video segment by highlight or key frame data, or a browsing system may pre-designate the form of display.

Also, the video browser of the present invention allows a user to select relations between characters from a graph showing a constant relation between characters and variable relations between characters. At this time, events which show the selected relation(s) are displayed. If an event is selected from the displayed events, a summary of video segments corresponding to the selected event, such as a highlight or key frame data, would be displayed to summarize the corresponding event. As discussed above, a user may designate whether to display a summary of a video segment by highlight or key frame data, or a browsing system may pre-designate the form of display.

In the video browser of the present invention, events may be displayed by a key frame or a text data which describes the event, or a combination of the key frame and the text data. Moreover, when a summary of a video segment of an event is displayed by highlight, the video browser displays the highlight for a period of the time DSs 206 corresponding to the summary of the video segment.
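By way of a non-limiting illustration, determining the periods to be played for a highlight might be sketched in Python as follows for two of the forms of reference described above: a reference to segments (as in Fig. 3d) and directly included time data (as in Fig. 3e). The class names, the segment_times lookup table, and the time unit are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# A time period as (start, end). Seconds are an assumed unit.
TimePeriod = Tuple[float, float]


@dataclass
class ReferenceToSegment:
    """Links an event to segments by identifier (compare Fig. 3d)."""
    segment_ids: List[str]


@dataclass
class DirectTime:
    """Holds the time periods in the event itself (compare Fig. 3e)."""
    periods: List[TimePeriod]


VisualizationRef = Union[ReferenceToSegment, DirectTime]


def playback_periods(
    ref: VisualizationRef,
    segment_times: Dict[str, TimePeriod],
) -> List[TimePeriod]:
    """Resolve a reference into the time periods to play as a highlight."""
    if isinstance(ref, DirectTime):
        # The periods are stored directly; no lookup is needed.
        return list(ref.periods)
    # Otherwise follow each segment link to its temporal position.
    return [segment_times[segment_id] for segment_id in ref.segment_ids]
```

The directly included form avoids a lookup at display time, while the linked form keeps a single copy of the temporal data in the segments.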

Fig. 4 shows an example screen of a video browser implemented by the present video data structure, in which a video can easily be understood and browsed based on constant and variable relations between characters in a video. Referring to Fig. 4, a video browser includes a character relation screen, a main screen, and a main scene screen. The character relation screen displays main characters of a video on a character screen 401, and displays characters having relations with a character selected from the character screen 401 on a relation screen 402. Here, the relations between the selected character and related characters are displayed by a tree structure, where a constant relation is placed on a top level of the tree while variable relations are placed on a lower level of the tree. Also, the displayed constant relation may include additional information such as a number of variable relations in the lower tree structure. For example, in the displayed constant relation between 'character 1' and 'character 2,' the number '2' displayed above 'character 2' indicates that there are two variable relations in the tree structure.
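By way of a non-limiting illustration, the tree shown on the relation screen 402 might be assembled in Python as follows. The tuple layout of the input, the class name RelationNode, and the sample relation names used with this sketch are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# One input relation: (character_a, character_b, relation_name, kind),
# where kind is "constant" or "variable". This layout is an assumption.
RelationRow = Tuple[str, str, str, str]


@dataclass
class RelationNode:
    other: str                       # the related character
    constant: Optional[str] = None   # top level of the tree
    variables: List[str] = field(default_factory=list)  # lower level of the tree

    @property
    def variable_count(self) -> int:
        """Number shown next to the related character on the relation screen."""
        return len(self.variables)


def build_tree(selected: str, rows: List[RelationRow]) -> List[RelationNode]:
    """Group the relations of the selected character by related character."""
    nodes: Dict[str, RelationNode] = {}
    for a, b, name, kind in rows:
        if selected not in (a, b):
            continue
        other = b if a == selected else a
        node = nodes.setdefault(other, RelationNode(other=other))
        if kind == "constant":
            node.constant = name
        else:
            node.variables.append(name)
    # Dictionaries keep insertion order, so nodes appear in input order.
    return list(nodes.values())
```

With one constant relation and two variable relations between 'character 1' and 'character 2', the node for 'character 2' reports a variable_count of 2, matching the number displayed in Fig. 4.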

Furthermore, events significant in a constant relation or a variable relation selected from the relation screen 402 are displayed by key frames on a main scene screen 403. Here, events significant in a relation may mean events which show the corresponding relation or events which brought about a change in a corresponding relation. A highlight of an event corresponding to a selected key frame from the main scene screen 403 can then be displayed on a main screen 404.

In Fig. 4, a user selected 'character 1' from among the characters in the character screen 401 and other characters 'character 2' to 'character n' related with 'character 1' are displayed on the relation screen 402 by a tree structure. As 'relation 2' with 'character 2' is selected from the relation screen 402, significant events corresponding to 'relation 2' with 'character 2' are displayed on the main scene screen 403 as key frames. Also, a summary video section data, i.e. highlight, of 'event 6' selected from the main scene screen 403 can be reproduced and displayed on the main screen 404, according to commands input through a user interface 405.

The implementation of the video browser of Fig. 4 using the data structure of Figs. 1 - 3 will next be explained.

The main characters are displayed on the character screen 401 based upon the object DS 108. When 'character 1' is selected, relations between 'character 1' and other characters having constant or variable relations with 'character 1' are displayed on the relation screen 402 based on the event/object relation graph DS 109. When a variable 'relation 2' with 'character 2' is selected from the relation screen 402, key frames representing events significant to 'relation 2' with 'character 2' are displayed on the main scene screen 403 based upon the reference to segment 110. Finally, when an 'event 6' is selected from the main scene screen 403, a key frame data or a highlight of 'event 6' can be displayed on the main screen 404 based upon the reference to visualization 111. At this time, the key frame data or highlight may be displayed on the main screen 404 based respectively upon the key frame view DS 114 or the highlight view DS 115 of the visualization DS 102.
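By way of a non-limiting illustration, the four selection steps described above might be sketched in Python as follows. The in-memory tables, the function names, and the placeholder values such as "kf6" and "highlight 6" are hypothetical stand-ins for the description schemes of Fig. 1 and are not taken from the specification.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the object DS 108.
OBJECTS: List[str] = ["character 1", "character 2"]

# Hypothetical stand-in for the event/object relation graph DS 109:
# (character, character) -> relation name -> significant events.
RELATIONS: Dict[Tuple[str, str], Dict[str, List[str]]] = {
    ("character 1", "character 2"): {
        "relation 1": ["event 2"],
        "relation 2": ["event 5", "event 6"],
    },
}

# Hypothetical stand-in for the event DS 107: a key frame label and a
# reference to visualization 111 for each event.
EVENTS: Dict[str, Dict[str, str]] = {
    "event 2": {"key_frame": "kf2", "visualization": "highlight 2"},
    "event 5": {"key_frame": "kf5", "visualization": "highlight 5"},
    "event 6": {"key_frame": "kf6", "visualization": "highlight 6"},
}


def character_screen() -> List[str]:
    """Step 1: list the main characters (character screen 401)."""
    return list(OBJECTS)


def relation_screen(selected: str) -> Dict[str, List[str]]:
    """Step 2: related characters and relation names (relation screen 402)."""
    related: Dict[str, List[str]] = {}
    for (a, b), relations in RELATIONS.items():
        if selected in (a, b):
            other = b if a == selected else a
            related[other] = list(relations)
    return related


def main_scene_screen(selected: str, other: str, relation: str) -> List[str]:
    """Step 3: key frames of events significant to the selected relation
    (main scene screen 403)."""
    relations = RELATIONS.get((selected, other)) or RELATIONS.get((other, selected), {})
    return [EVENTS[event]["key_frame"] for event in relations.get(relation, [])]


def main_screen(event: str) -> str:
    """Step 4: the summary to display for the selected event (main screen 404)."""
    return EVENTS[event]["visualization"]
```

Calling the four functions in order reproduces the walk from 'character 1' through 'relation 2' with 'character 2' to the summary of 'event 6'.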

Fig. 5 shows another example screen of a video browser implemented by the present video data structure, in which a video can easily be understood and browsed based on object-place relations. A video browser based on object-place relations is disclosed in co-pending U.S. Patent Application Serial No. 09/239,531, entitled "Contents-Based Video Story Browsing System," and is fully incorporated herein.

Referring to Fig. 5, a video browser includes a key frame screen and a story screen. The key frame screen displays a relation graph between main characters and places by key frames on a relation screen 501, and significant events of a relation selected from the relation screen 501 are displayed on a text screen 502 by key frames with brief annotations. Thus, a highlight of a significant event selected from the relation screen 501 can be reproduced and displayed on a main screen 503.

In the example embodiments of Figs. 4 and 5, a video segment corresponding to an event is summarized and displayed by a highlight as described above. Alternatively, however, a video segment corresponding to an event may be summarized and displayed using other data such as key frames.

Also, the object DS 108 in Fig. 1 includes information which is used in displaying objects, such as a character or a place. In such case, an object type may be included in the object DS 108 to determine whether an object is a place or a character. Alternatively, to separate the information used in displaying an object and/or a place, the semantic structure DS 104 may include the object DS 108, the event DS 107, a place DS, and an object/place/event relation graph DS rather than the event/object relation graph DS 109.

As discussed above, the video data structure according to the present invention for video browsing based on events has the following advantages. As the present video data structure includes semantic data, summary data, and linking data, an event in a video can be browsed using highlight or key frames, based on a relation between characters or a relation between objects and places, thereby allowing a user-friendly browsing system. Furthermore, in the present invention, events can be browsed based on factors such as character and place that significantly act on the development of a story in a movie or drama. Accordingly, a portion of a video can be easily selected and displayed by a user to provide effective video browsing.

The foregoing embodiments are merely exemplary and are not to be construed as limiting the present invention. The present teachings can be readily applied to other types of apparatuses. The description of the present invention is intended to be illustrative, and not to limit the scope of the claims. Many alternatives, modifications, and variations will be apparent to those skilled in the art.

Claims

WHAT IS CLAIMED IS:

1. A video data structure for a video browsing system comprising: a visualization DS which includes summary data of at least one video segment of a video; a syntactic structure DS which includes information for displaying actual video segments of the video; and a semantic structure DS which includes additional information describing the video, wherein the semantic DS includes at least one event DS which includes event information and wherein each event DS includes data for linking to the syntactic structure DS to display actual video segments of a selected event and data for linking to the visualization DS to display a summary data of a selected event.

2. A data structure of claim 1, wherein the visualization DS comprises either one or a combination of at least one reference to segments which links the visualization DS to the syntactic structure DS to display a summary data of a selected event; at least one highlight view DS which includes highlight data for displaying a summary data of a selected event; and at least one key frame view DS which includes key frame data for displaying a summary data of a selected event.

3. A data structure of claim 2, wherein each highlight view DS is organized into multiple levels of highlight data based upon a degree of detail included, and a highlight data corresponding to each level includes a segment data which links to video segments which will be used in displaying a highlight of the selected event.

4. A data structure of claim 2, wherein each highlight view DS is organized into multiple levels of highlight data based upon a degree of detail included, and a highlight data corresponding to each level includes a time data which is a time period used in displaying a highlight of the selected event.

5. A data structure of claim 1, wherein the syntactic structure DS is organized into at least one segment DS including actual video segment data and a time DS for each segment DS including temporal positions of a corresponding video segment data within a video data.

6. A data structure of claim 1, wherein the semantic structure DS further comprises: at least one object DS which includes object information; and at least one event/relation graph DS which includes relation information.

7. A data structure of claim 6, wherein each event DS comprises: at least one reference to segment which includes data for linking to the syntactic structure DS to display actual video segments of a selected event; and at least one reference to visualization which includes data for linking to the visualization DS to display a summary data of a selected event.

8. A data structure of claim 7, wherein each event DS further comprises an annotation DS for explaining a selected event.

9. A data structure of claim 7, wherein the visualization DS comprises either one or both of at least one highlight view DS which includes highlight data for displaying a summary data of a selected event, and at least one key frame view DS which includes key frame data for displaying a summary data of a selected event.

10. A video data structure of claim 9, wherein each reference to visualization comprises:

(a) at least one reference to highlight view DS which links the event DS to the highlight view DS for displaying a highlight of a selected event; or

(b) at least one reference to key frame view DS which links the event DS to the key frame view DS for displaying a summary data of a selected event by key frames; or

(c) at least one reference to time DS which links the event DS to one or more time periods of actual video segments corresponding to a selected event; or

(d) at least one reference to segment DS which links the event DS to one or more actual video segments corresponding to a selected event; or

(e) at least one time DS which is a time period used in displaying a highlight of a selected event; or

(f) a combination of (a) to (e).

11. A data structure of claim 6, wherein each event/object relation graph may include data on one of a constant relation between objects, a variable relation between objects, a relation between objects and places, or a relation between objects and events.

12. A data structure of claim 1, wherein the semantic structure DS comprises: at least one event DS which includes event information of the video; at least one object DS which includes object information of the video; at least one place DS which includes place information of the video; and at least one object/place/event relation graph DS which includes relation information.

13. A video browsing system comprising: a character screen which displays characters of a video; a relation screen which displays constant relations and variable relations between characters of the video by a tree structure; a main scene screen which displays significant events corresponding to one of either a constant relation or variable relation selected from the relation screen; and a main screen which displays a summary data of at least one video segment corresponding to a significant event selected from the main scene screen.

14. A system of claim 13, wherein said summary data is one of either a video highlight or key frames.

15. A system of claim 13, wherein the character screen, the relation screen, the main scene screen, and the main screen are displayed based on a video data structure comprising: a visualization DS which includes summary data of at least one video segment of a video; a syntactic structure DS which includes information for displaying actual video segments of the video; and a semantic structure DS which includes additional information describing the video, wherein the semantic DS includes at least one event DS which includes event information and wherein each event DS includes data for linking to the syntactic structure DS to display actual video segments of a selected event and data for linking to the visualization DS to display a summary data of a selected event.

16. A system of claim 15, wherein the visualization DS comprises either one or a combination of at least one reference to segments which links the visualization DS to the syntactic structure DS to display a summary data of a selected event, at least one highlight view DS which includes highlight data for displaying a summary data of a selected event, and at least one key frame view DS which includes key frame data for displaying a summary data of a selected event.

17. A video browsing system comprising: a relation screen which displays a relation graph between main characters and places of a video by key frames; a text screen which displays significant events of a relation selected from the relation screen; a main screen which displays a summary data of at least one video segment corresponding to a significant event selected from the relation screen.

18. A system of claim 17, wherein the text screen displays said significant events by key frames with brief annotations.

19. A system of claim 17, wherein said summary data is one of either a video highlight or key frames.

20. A system of claim 17, wherein the relation screen, the text screen, and the main screen are displayed based on a video data structure comprising: a visualization DS which includes summary data of at least one video segment of a video; a syntactic structure DS which includes information for displaying actual video segments of the video; and a semantic structure DS which includes additional information describing the video, wherein the semantic DS includes at least one event DS which includes event information and wherein each event DS includes data for linking to the syntactic structure DS to display actual video segments of a selected event and data for linking to the visualization DS to display a summary data of a selected event.