
EP1320992A2 - Method for highlighting important information in a video program using cues - Google Patents

Method for highlighting important information in a video program using cues

Info

Publication number
EP1320992A2
EP1320992A2 EP01971992A EP01971992A EP1320992A2 EP 1320992 A2 EP1320992 A2 EP 1320992A2 EP 01971992 A EP01971992 A EP 01971992A EP 01971992 A EP01971992 A EP 01971992A EP 1320992 A2 EP1320992 A2 EP 1320992A2
Authority
EP
European Patent Office
Prior art keywords
cue
video clip
preselected
frames
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP01971992A
Other languages
German (de)
English (en)
Inventor
Mohamed Abdel-Mottaleb
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips Electronics NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics NV filed Critical Koninklijke Philips Electronics NV
Publication of EP1320992A2 publication Critical patent/EP1320992A2/fr
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/785Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using colour or luminescence
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7834Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/7857Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using texture

Definitions

  • The present invention relates to content-based video retrieval and browsing, and more particularly, to a method for automatically identifying important information or developments in video clips of sports events.
  • Video applications call for browsing methods which enable one to browse through a large amount of video material to find clips which are of a certain importance.
  • Such applications may include for example, interactive TV and pay-per-view systems.
  • Customers who use interactive TV and pay-per-view systems want to see sections of programs before renting them.
  • Video browsers enable the customers to find programs of interest.
  • Existing approaches to content-based video retrieval typically rely on low-level features such as color, texture, shape and camera motion.
  • While low-level features can be useful for certain applications, many other interesting applications require the use of higher-level semantic information. Bridging the gap between low-level features and high-level semantic information is not always easy. In most cases where higher-level semantic information is required, manual annotation using keywords is used.
  • One of the important applications for video archiving and retrieval is for sports such as soccer, football, etc. Accordingly, a method is needed which enables automatic extraction of high level information using low level features.
  • The present invention is directed to a method for automatically identifying important developments in video clips of sporting events, especially soccer matches.
  • The method comprises detecting sequences of frames in a video clip of a sporting event that have a preselected cue indicative of a possible important development in frames of the video clip immediately preceding the frame sequences having the preselected cue; comparing the number of frames in each of the frame sequences having the cue to a predefined threshold number; and declaring an important development in the frames immediately preceding each frame sequence if the number of frames in that sequence is equal to or greater than the threshold number.
  • The method further involves acquiring the preselected cue from low-level features in the image in each frame of the sequence.
  • The preselected cue is based on changes in the camera's center of attention. More particularly, when an important development occurs in the video clip, the camera typically focuses on the viewers or players, and thus, the images in the sequence of frames immediately subsequent to the frames with the important development have little or no grass areas.
  • Fig. 1 is a flowchart outlining an algorithm that performs an illustrative embodiment of the method of the present invention.
  • Fig. 2 is a block diagram of a computer for implementing the present invention.
  • Fig. 3 is a block diagram of the internal structure of the computer for implementing the present invention.
  • The method of the present invention extracts high level information from multiple images or video using low-level features in order to achieve advancements in content-based retrieval and browsing. This is accomplished in the present invention by specifying a particular domain of interest and using knowledge specific to that domain to automatically extract high level information based on low-level features.
  • One especially useful application for the present invention is in highlighting segments of important developments in video clips of sports events, including but not limited to soccer matches and football games. Such video clips typically include video, audio, and textual (close-captioning) information.
  • The method of the present invention highlights important developments in a video clip by inferring the developments from one or more cues which are provided from low-level features and textual information of the video clip. More particularly, the method detects sequences of frames in the video clip having a certain preselected visual, audible, and/or textual (close-captioning) cue. The number of frames in each sequence having the cue(s) is then compared to a predefined threshold number. If the number of frames in a sequence is equal to or greater than the threshold number, an important development is declared in the frames immediately preceding the threshold-meeting frame sequence with the cue. It has been found that important developments in video clips of sports events are typically marked with a visual cue which relates to changes in the camera's center of attention.
  • The video camera usually focuses on the stadium viewers or the players.
  • When the camera focuses on the viewers or players, little or none of the grass of the playing field can be seen in the camera's field of view.
  • The method of the present invention detects sequences of frames in the video clip with images that have little or no grass areas of the playing field.
  • The number of frames in each sequence is compared to a predefined threshold number. If the number of frames in the sequence is equal to or greater than the threshold number, an important development is declared in the frames immediately preceding the threshold-meeting frame sequence that has little or no grass areas.
  • The threshold is based on the assumption that if the number of frames in the sequence with little or no grass areas of the playing field is significant, the camera must be focusing on the viewers or the players. Consequently, it is likely that the frames immediately preceding that sequence of frames include an important development such as the scoring of a goal in the case of a soccer match.
  • Fig. 1 shows a flowchart which outlines an illustrative embodiment of an algorithm for performing the method of the present invention as it applies to highlighting segments of important events in a video clip of a soccer match.
  • The algorithm in step S1 detects sequences of frames in the video clip in which there are little or no grass areas.
  • In step S2, if the number of frames in the sequence is larger than a predefined threshold, then an important event is declared in step S3.
  • The algorithm detects green areas which have colors similar to grass.
  • The algorithm is trained to differentiate the green colors from the other colors in each frame so that the grass areas in the frame can be identified. This is accomplished using patches from a training set of images of grass areas which have been extracted from the soccer match in the video clip, or from one or more previous soccer matches.
  • The algorithm learns from the patches how the grass areas translate into the values of the color green. Given an image in a frame of the video clip, the training is used to judge whether a given pixel in the frame is grass.
  • A color histogram of an image is obtained by dividing a color space, such as red, green, and blue, into discrete image colors (called bins) and counting the number of times each discrete color appears by traversing every pixel in the image.
  • This normalized histogram can be considered as the probability density function for the class grass, p(pixel value | grass).
  • The detection step S1 is accomplished in the algorithm by marking pixels in each frame that have a value of p(pixel value | grass) above a threshold.
  • If small grass color components are detected for only a short period of time in step S2, for example in only three or four frames, then no important event is declared in step S3. However, if small grass color components are detected for a relatively long period of time, for example in 200-300 frames, then an important event is declared in step S3.
  • The results obtained with the algorithm can be further refined using other cues, either from the same modality or from other modalities, such as audio or closed captions. Cues from the same or different modalities can be used to confirm the identity of the detected important occurrences or activities and, more importantly, to classify the detected important occurrences or activities into semantic classes, such as goals, attempted goals, penalties, injuries, fights between players and the like, and rank them by importance.
  • The method of Figure 1 is implemented by computer-readable code executed by a data processing apparatus.
  • The code may be stored in a memory within the data processing apparatus or read/downloaded from a memory medium such as a CD-ROM or floppy disk.
  • Hardware circuitry may be used in place of, or in combination with, software instructions to implement the invention.
  • The invention, for example, can also be implemented on a computer 30 shown in Fig. 2.
  • The computer 30 may include a network connection 31 for interfacing to a data network, such as a variable-bandwidth network or the Internet, and a fax/modem connection 32 for interfacing with other remote sources such as a video or a digital camera (not shown).
  • The computer 30 may also include a display for displaying information (including video data) to a user, a keyboard for inputting text and user commands, a mouse for positioning a cursor on the display and for inputting user commands, a disk drive for reading from and writing to floppy disks installed therein, and a CD-ROM drive for accessing information stored on CD-ROM.
  • The computer 30 may also have one or more peripheral devices 38 attached thereto for inputting images or the like, and a printer for outputting images, text, or the like.
  • Fig. 3 shows the internal structure of the computer 30, which includes a memory 40 that may include a Random Access Memory (RAM), Read-Only Memory (ROM) and a computer-readable medium such as a hard disk.
  • The items stored in the memory 40 include an operating system 41, data 42 and applications 43.
  • The operating system 41 may be a windowing operating system, such as UNIX, although the invention may be used with other operating systems as well, such as Microsoft Windows 95.
  • The applications stored in the memory 40 include a video coder 44, a video decoder 45 and a frame grabber 46.
  • The video coder 44 encodes video data in a conventional manner.
  • The video decoder 45 decodes video data which has been coded in the conventional manner.
  • The frame grabber 46 allows single frames from a video signal stream to be captured and processed.
  • The CPU 50 comprises a microprocessor or the like for executing computer-readable code, i.e., applications, such as those noted above, out of the memory 40.
  • Applications may be stored in the memory 40 (as noted above) or, alternatively, on a floppy disk in disk drive 36 or a CD-ROM in CD-ROM drive 37.
  • The CPU 50 accesses the applications (or other data) stored on a floppy disk via the memory interface 52 and accesses the applications (or other data) stored on a CD-ROM via the CD-ROM drive interface 53.
  • Input video data may be received through the video interface 54 or the communication interface 51.
  • The input video data may be decoded by the video decoder 45.
  • Output video data may be coded by the video coder 44 for transmission through the video interface 54 or the communication interface 51.
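The grass-area cue and run-length threshold described above can be sketched in code. The following Python fragment is an illustration only, not the patented implementation: the bin count `BINS`, the probability cutoff `p_min`, the grass-fraction cutoff `grass_cutoff`, and the run length `min_run` (200 frames, taken from the example above) are all assumed parameters, and `find_highlights` simply reports the frame index immediately preceding each sufficiently long low-grass run.

```python
import numpy as np

BINS = 8  # histogram bins per RGB channel (assumed value)


def grass_histogram(patches):
    """Build the normalized color histogram p(pixel value | grass) from
    training patches (H x W x 3 uint8 arrays cropped from grass areas)."""
    hist = np.zeros((BINS, BINS, BINS))
    for patch in patches:
        idx = (patch // (256 // BINS)).reshape(-1, 3)
        np.add.at(hist, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return hist / hist.sum()


def grass_fraction(frame, hist, p_min=1e-4):
    """Fraction of pixels in a frame classified as grass: pixels whose
    bin probability under p(pixel value | grass) exceeds p_min."""
    idx = (frame // (256 // BINS)).reshape(-1, 3)
    return float((hist[idx[:, 0], idx[:, 1], idx[:, 2]] > p_min).mean())


def find_highlights(fractions, grass_cutoff=0.1, min_run=200):
    """Indices of frames immediately preceding each run of at least
    min_run consecutive low-grass frames (steps S2 and S3 above)."""
    highlights, run_start = [], None
    for i, f in enumerate(list(fractions) + [1.0]):  # sentinel closes a final run
        if f < grass_cutoff:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                highlights.append(max(run_start - 1, 0))
            run_start = None
    return highlights
```

As the description suggests, the training patches would be grass regions cropped from the current match or from earlier matches; the returned indices then mark candidate important developments such as goals.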

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Television Signal Processing For Recording (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method for highlighting important developments in a video clip of a sports event, such as a video clip of a soccer match, by inferring the presence of these developments from a cue provided by low-level features in the video clip. The method detects, in the video clip, sequences of frames containing certain preselected visual or audible cues. The number of frames in each sequence containing the cue is then compared to a predefined threshold. If the number of frames in a sequence is equal to or greater than the threshold, the frames immediately preceding the threshold-meeting frame sequence contain an important development.
EP01971992A 2000-09-13 2001-08-30 Method for highlighting important information in a video program using cues Withdrawn EP1320992A2 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US66091800A 2000-09-13 2000-09-13
PCT/EP2001/010112 WO2002023891A2 (fr) 2000-09-13 2001-08-30 Method for highlighting important information in a video program using cues
US660918 2003-09-13

Publications (1)

Publication Number Publication Date
EP1320992A2 true EP1320992A2 (fr) 2003-06-25

Family

ID=24651479

Family Applications (1)

Application Number Title Priority Date Filing Date
EP01971992A 2000-09-13 2001-08-30 Method for highlighting important information in a video program using cues Withdrawn EP1320992A2 (fr)

Country Status (3)

Country Link
EP (1) EP1320992A2 (fr)
JP (1) JP2004509529A (fr)
WO (1) WO2002023891A2 (fr)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4577774B2 (ja) * 2005-03-08 2010-11-10 KDDI Corp Sports video classification apparatus and log generation apparatus
WO2007073347A1 (fr) * 2005-12-19 2007-06-28 Agency For Science, Technology And Research Annotation of video footage and personalized video generation
US9047374B2 (en) * 2007-06-08 2015-06-02 Apple Inc. Assembling video content
JP2011015129A (ja) * 2009-07-01 2011-01-20 Mitsubishi Electric Corp Image quality adjustment apparatus
JP6354229B2 (ja) 2014-03-17 2018-07-11 Fujitsu Ltd Extraction program, method, and apparatus
JP6427902B2 (ja) 2014-03-17 2018-11-28 Fujitsu Ltd Extraction program, method, and apparatus

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU719329B2 (en) * 1997-10-03 2000-05-04 Canon Kabushiki Kaisha Multi-media editing method and apparatus

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3728775B2 (ja) * 1995-08-18 2005-12-21 Hitachi Ltd Method and apparatus for detecting characteristic scenes in moving images
KR100206804B1 (ko) * 1996-08-29 1999-07-01 구자홍 Method for automatically selecting and recording highlight portions
JPH1155613A (ja) * 1997-07-30 1999-02-26 Hitachi Ltd Recording and/or reproducing apparatus and recording medium used therein
KR20010041607A (ko) 1998-03-04 2001-05-25 The Trustees of Columbia University in the City of New York Method and system for generating semantic visual templates for image and video retrieval
US6163510A (en) * 1998-06-30 2000-12-19 International Business Machines Corporation Multimedia search and indexing system and method of operation using audio cues with signal thresholds

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU719329B2 (en) * 1997-10-03 2000-05-04 Canon Kabushiki Kaisha Multi-media editing method and apparatus

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHANG Y.-L. ET AL: "Integrated image and speech analysis for content-based video indexing", PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS, 17 June 1996 (1996-06-17), LOS ALAMITOS, CA, US, pages 306 - 313 *
See also references of WO0223891A3 *

Also Published As

Publication number Publication date
WO2002023891A3 (fr) 2002-05-30
WO2002023891A2 (fr) 2002-03-21
JP2004509529A (ja) 2004-03-25

Similar Documents

Publication Publication Date Title
US7339992B2 (en) System and method for extracting text captions from video and generating video summaries
Truong et al. Scene extraction in motion pictures
JP4643829B2 (ja) System and method for analyzing video content using detected text in video frames
JP5420199B2 (ja) Video analysis apparatus, video analysis method, automatic digest creation system and automatic highlight extraction system
US7120873B2 (en) Summarization of sumo video content
US8340498B1 (en) Extraction of text elements from video content
CN110381366B (zh) 赛事自动化报道方法、系统、服务器及存储介质
EP2089820B1 (fr) Method and apparatus for generating a summary of a video data stream
JP6557592B2 (ja) Video scene segmentation apparatus and video scene segmentation program
TW201907736 (zh) Method and device for generating video summary
US8051446B1 (en) Method of creating a semantic video summary using information from secondary sources
Snoek et al. Time interval maximum entropy based event indexing in soccer video
EP1320992A2 (fr) Method for highlighting important information in a video program using cues
Chen et al. Knowledge-based approach to video content classification
KR102754939B1 (ko) Apparatus and method for generating a sports game summary video
CN117221669B (zh) A bullet-screen comment generation method and device
KR102754941B1 (ko) Apparatus for generating a sports game summary video and method for generating a sports game summary video
US20070124678A1 (en) Method and apparatus for identifying the high level structure of a program
Brezeale Learning video preferences using visual features and closed captions
KR102889621B1 (ko) Video indexing apparatus and method
Bailer et al. Skimming rushes video using retake detection
Hsieh et al. Constructing a bowling information system with video content analysis
Gupta A Survey on Video Content Analysis
Gao et al. A study of intelligent video indexing system
Lotfi A Novel Hybrid System Based on Fractal Coding for Soccer Retrieval from Video Database

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20030414

AK Designated contracting states

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

17Q First examination report despatched

Effective date: 20090312

APBK Appeal reference recorded

Free format text: ORIGINAL CODE: EPIDOSNREFNE

APBN Date of receipt of notice of appeal recorded

Free format text: ORIGINAL CODE: EPIDOSNNOA2E

APBR Date of receipt of statement of grounds of appeal recorded

Free format text: ORIGINAL CODE: EPIDOSNNOA3E

APAF Appeal reference modified

Free format text: ORIGINAL CODE: EPIDOSCREFNE

APBT Appeal procedure closed

Free format text: ORIGINAL CODE: EPIDOSNNOA9E

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: KONINKLIJKE PHILIPS N.V.

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20130301