CN116935275A - Scene segmentation method and device based on area of interest - Google Patents
- Publication number
- CN116935275A (CN202310895397.4A)
- Authority
- CN
- China
- Prior art keywords
- interest
- region
- transition frame
- information
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/49—Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Television Signal Processing For Recording (AREA)
Description
Technical field
The present application relates to the field of image processing, and specifically to a scene segmentation method and device based on a region of interest.
Background
A switch between scene segments in a video is called a transition. Transitions include hard transitions and soft transitions (such as fades, dissolves, wipes, and cross-dissolves). Scene segmentation divides a complete video, according to its transitions, into semantically complete video processing units called shots. Scene segmentation is a prerequisite for many video post-processing pipelines and is also commonly used in automated video annotation, content-based video retrieval, and similar tasks. The growth of video platforms has also driven a surge in user-created video, but creating a video usually requires the user to cut the needed segments out of a source video by scene and then combine those segments, according to the user's own understanding, into new video content. Segmenting video clips by scene is time-consuming, and automated scene segmentation can help users reduce the cost of creation.
Most existing automatic scene segmentation relies on changes in image features between adjacent frames of different scene segments, such as large differences in color, brightness, or statistical features. For example, when segmenting on the basis of correlation between consecutive frames, the similarity of the preceding and following frames is computed, for instance with an image similarity measure. Specifically, the grayscale histograms of the images can be used to compute a grayscale histogram similarity, which determines the similarity of the two frames, and the scene is segmented according to that similarity. Video clips whose background changes rapidly are often mistakenly split into different scenes. Figure 1a and Figure 1b show consecutive frames of a basketball game: when the camera moves quickly, the similarity comparison treats the two frames as different scenes, which causes a segmentation error. Alternatively, algorithms based on edge and contour detection decide whether to segment according to differences in the object edges and contours of the preceding and following frames. Although such algorithms can effectively distinguish the motion of video objects from abrupt scene changes and can detect cross-dissolves or fades, shots with rapidly changing backgrounds, or shots in which the camera tracks the action during a game, are easily misdetected, which again causes segmentation errors.
Summary of the invention
In view of the above problems, embodiments of the present application are proposed to provide a region-of-interest-based scene segmentation method and device that overcome the above problems or at least partially solve them.
According to a first aspect of the embodiments of the present application, a region-of-interest-based scene segmentation method is provided, which includes:
pre-analyzing a video to be segmented, determining a region of interest of the video to be segmented, and extracting first information of the region of interest;
performing scene pre-segmentation on the video to be segmented to obtain at least one preselected transition frame;
performing region-of-interest detection on the preselected transition frame to obtain second information of the region of interest;
matching the first information against the second information to determine the transition frames that meet a preset transition condition, and performing scene segmentation based on those frames.
Optionally, pre-analyzing the video to be segmented, determining the region of interest of the video to be segmented, and extracting the first information of the region of interest further includes:
pre-analyzing the video to be segmented, determining the regions of interest of a video frame in the video to be segmented, and extracting a first total number and coordinate information of the regions of interest in the video frame; the coordinate information includes corner coordinate information, height information, and width information of each region of interest.
Optionally, the objects corresponding to the regions of interest are the same or different;
extracting the first total number of regions of interest in the video frame specifically includes: counting the regions of interest of different objects in the video frame to obtain the first total number of regions of interest.
Optionally, pre-analyzing the video to be segmented and determining the regions of interest of a video frame in the video to be segmented further includes:
pre-analyzing the video to be segmented and determining the regions of interest of one or more video frames in the video to be segmented.
Optionally, the method further includes:
determining a detection template image corresponding to the region of interest according to the coordinate information of the region of interest.
Optionally, performing region-of-interest detection on the preselected transition frame to obtain the second information of the region of interest further includes:
performing region-of-interest detection on the preselected transition frame according to the detection template image to obtain the regions of interest contained in the preselected transition frame, and counting a second total number of regions of interest contained in the preselected transition frame.
Optionally, performing region-of-interest detection on the preselected transition frame according to the detection template image to obtain the regions of interest contained in the preselected transition frame, and counting the second total number of regions of interest contained in the preselected transition frame, further includes:
performing region-of-interest detection on the preselected transition frame according to the detection template image to obtain multiple regions of interest contained in the preselected transition frame;
counting the regions of interest of each different object in the preselected transition frame to obtain the second total number of regions of interest, wherein, if multiple regions of interest correspond to the same object, those regions are counted as 1;
obtaining, by counting, the second total number of regions of interest contained in the preselected transition frame.
Optionally, matching the first information against the second information to determine the transition frames that meet the preset transition condition for scene segmentation further includes:
calculating the ratio of the second total number to the first total number, based on the first total number in the first information and the second total number in the second information;
determining whether the ratio exceeds a preset first transition threshold;
if so, determining that the preselected transition frame does not meet the preset transition condition.
Optionally, pre-analyzing the video to be segmented, determining the region of interest of the video to be segmented, and extracting the first information of the region of interest further includes:
pre-analyzing the video to be segmented, determining the regions of interest of the video to be segmented, and extracting the first total number of regions of interest, their coordinate information, and a weight value for the object of each region of interest;
matching the first information against the second information to determine the transition frames that meet the preset transition condition for scene segmentation further includes:
calculating, according to the first information, the sum of the weight values of the objects of all regions of interest to obtain a first weight value;
determining, according to the second information, the regions of interest contained in the preselected transition frame, and accumulating the weight values of the objects of those regions of interest to obtain a second weight value;
calculating the ratio of the second weight value to the first weight value, and determining whether the ratio exceeds a preset second transition threshold;
if so, determining that the preselected transition frame does not meet the preset transition condition.
Optionally, performing scene pre-segmentation on the video to be segmented to obtain at least one preselected transition frame further includes:
performing scene pre-segmentation on the video to be segmented, based on scene detection, to obtain a set of preselected transition frame numbers containing at least one preselected transition frame number, wherein the preselected transition frame numbers correspond one-to-one to the preselected transition frames.
Optionally, performing region-of-interest detection on the preselected transition frame to obtain the second information of the region of interest, and matching the first information against the second information to determine the transition frames that meet the preset transition condition for scene segmentation, further includes:
traversing the set of preselected transition frame numbers; performing region-of-interest detection on the preselected transition frame corresponding to any preselected transition frame number to obtain the second information of the regions of interest of that frame; matching the first information against the second information to determine whether the preselected transition frame is a transition frame that meets the preset transition condition; if not, deleting the preselected transition frame number; obtaining the next preselected transition frame number in the set and continuing the traversal until the whole set has been traversed; and performing scene segmentation according to the preselected transition frame numbers remaining in the set.
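The traversal described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the names `filter_transition_frames`, `split_into_scenes`, and the callback `is_real_transition` are hypothetical, and the callback stands in for the region-of-interest detection and matching step. The sketch builds a new list instead of deleting entries from the set during iteration, which yields the same remaining frame numbers.

```python
from typing import Callable, List, Tuple


def filter_transition_frames(
    candidate_frame_numbers: List[int],
    is_real_transition: Callable[[int], bool],
) -> List[int]:
    """Keep only the candidate frame numbers that meet the transition condition.

    `is_real_transition` is a hypothetical callback standing in for ROI
    detection on the frame followed by matching against the first information.
    """
    kept = []
    for frame_number in candidate_frame_numbers:
        if is_real_transition(frame_number):
            kept.append(frame_number)
    return kept


def split_into_scenes(total_frames: int, transition_frames: List[int]) -> List[Tuple[int, int]]:
    """Turn transition frame numbers into half-open (start, end) frame ranges."""
    scenes = []
    start = 0
    for cut in sorted(transition_frames):
        if start < cut < total_frames:
            scenes.append((start, cut))
            start = cut
    scenes.append((start, total_frames))
    return scenes


# Candidate 75 is assumed to be a false positive (fast camera motion).
remaining = filter_transition_frames([30, 75, 120], lambda n: n != 75)
scenes = split_into_scenes(150, remaining)
```

Here each remaining frame number is treated as the first frame of a new scene; whether a cut belongs to the preceding or the following scene is an assumption of this sketch.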
According to a second aspect of the embodiments of the present application, a region-of-interest-based scene segmentation device is provided, which includes:
a region-of-interest module, adapted to pre-analyze a video to be segmented, determine a region of interest of the video to be segmented, and extract first information of the region of interest;
a pre-segmentation module, adapted to perform scene pre-segmentation on the video to be segmented to obtain at least one preselected transition frame;
a detection module, adapted to perform region-of-interest detection on the preselected transition frame to obtain second information of the region of interest;
a matching module, adapted to match the first information against the second information to determine the transition frames that meet a preset transition condition for scene segmentation.
According to a third aspect of the embodiments of the present application, a computing device is provided, including a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another over the communication bus;
the memory is used to store at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the above region-of-interest-based scene segmentation method.
According to a fourth aspect of the embodiments of the present application, a computer storage medium is provided. At least one executable instruction is stored in the storage medium, and the executable instruction causes a processor to perform the operations corresponding to the above region-of-interest-based scene segmentation method.
According to the region-of-interest-based scene segmentation method and device provided by the present application, after scene pre-segmentation yields the preselected transition frames, region-of-interest detection can be performed on each preselected transition frame, and the second information of the detected regions of interest is compared with the first information of the regions of interest determined by pre-analysis. This avoids false detections and achieves accurate scene segmentation.
The above is only an overview of the technical solution of the present application. So that the technical means of the present application can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features, and advantages of the present application are more readily apparent, specific embodiments of the present application are set out below.
Description of the drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be construed as limiting the present application. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
Figure 1a is a schematic diagram of the preceding frame of a basketball video;
Figure 1b is a schematic diagram of the following frame of the basketball video;
Figure 2 is a flow chart of a region-of-interest-based scene segmentation method according to one embodiment of the present application;
Figure 3 is a flow chart of a region-of-interest-based scene segmentation method according to another embodiment of the present application;
Figure 4 is a schematic structural diagram of a region-of-interest-based scene segmentation device according to one embodiment of the present application;
Figure 5 is a schematic structural diagram of a computing device according to one embodiment of the present application.
Detailed description
Exemplary embodiments of the present application are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present application, it should be understood that the present application may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present application can be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
First, the terms used in one or more embodiments of the present application are explained.
Transition: the passage or change from one video scene to another;
Scene segmentation: dividing a video into independent video clips according to changes in shot scale or scene;
Image similarity calculation: a measure of how similar the content of two images is, where the degree of similarity is judged from the computed score. Several metrics can be used to assess image similarity, such as CORR (correlation coefficient);
Image grayscale histogram: the distribution of gray levels in an image, which directly shows the proportion of each gray level and reflects the brightness and contrast of the image;
Grayscale histogram similarity calculation: converting each image to grayscale, computing its histogram, and calculating the similarity between the histograms of the two grayscale images. Existing histogram similarity measures are mainly distance-based, such as the Manhattan distance, Euclidean distance, Hausdorff distance, central moment method, and χ² statistical distance.
Template matching: a pattern recognition method that finds where the pattern of a specific object is located in an image and thereby identifies the object.
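To make the grayscale histogram similarity calculation concrete, here is a minimal sketch in plain Python. It assumes 8-bit grayscale frames given as non-empty nested lists of integers in 0..255; the function names are hypothetical, and the χ² form used here is one common variant (some definitions include a factor of 1/2).

```python
from typing import List, Sequence


def gray_histogram(gray: Sequence[Sequence[int]], bins: int = 256) -> List[float]:
    """Normalized histogram of an 8-bit grayscale image (rows of ints in 0..255)."""
    counts = [0] * bins
    total = 0
    for row in gray:
        for value in row:
            counts[value * bins // 256] += 1
            total += 1
    return [c / total for c in counts]


def manhattan_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Sum of absolute bin differences; 0 means identical histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def chi_square_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Chi-square distance, skipping bins that are empty in both histograms."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


frame_a = [[0, 0], [255, 255]]
frame_b = [[0, 0], [0, 255]]
hist_a = gray_histogram(frame_a)
hist_b = gray_histogram(frame_b)
distance = manhattan_distance(hist_a, hist_b)
```

A small distance suggests the two frames belong to the same scene. As the background section notes, a fast camera move can produce a large distance even when the foreground objects are unchanged.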
Figure 2 shows a flow chart of a region-of-interest-based scene segmentation method according to an embodiment of the present application. As shown in Figure 2, the method includes the following steps:
Step S201: pre-analyze the video to be segmented, determine the region of interest of the video to be segmented, and extract first information of the region of interest.
When the prior art segments a video according to changes in image features between adjacent frames, fast camera movement and a rapidly changing background easily cause segmentation errors. This embodiment is based on regions of interest: during scene segmentation, region-of-interest detection on a transition frame determines whether a transition has actually occurred. This avoids the segmentation errors that arise when the background changes rapidly while the people and objects in the foreground do not. Preselected transition frame numbers that do not meet the preset transition condition are removed, which keeps the segmentation accurate.
Specifically, the video to be segmented is first pre-analyzed to determine the ROIs (regions of interest) it contains. For example, the regions of interest can be determined by image analysis of the video frames in light of the implementation, or an interface can be provided so that the user can select the regions of interest in a video frame by drawing boxes. No limitation is imposed here.
After the regions of interest are determined, information can be extracted for each region of interest to obtain the first information. The first information may include, for example, the total number of regions of interest in the video frame and the coordinate information of each region of interest. The coordinate information locates a region of interest within the video frame, which makes it easy to extract the object inside the region.
Step S202: perform scene pre-segmentation on the video to be segmented to obtain at least one preselected transition frame.
Scene pre-segmentation can be performed on the video to be segmented. This pre-segmentation is not the final scene segmentation: scene detection is first applied to the video to obtain at least one preselected transition frame, arranged in order.
Further, the execution order of step S202 and step S201 is not limited; either step may be executed first, depending on the implementation.
Step S203: perform region-of-interest detection on the preselected transition frame to obtain second information of the region of interest.
Region-of-interest detection is first performed on the preselected transition frame to find the corresponding regions of interest in that frame. The detection can be done, for example, by template matching: the regions of interest determined by pre-analysis are used as templates and matched against the preselected transition frame, which yields the regions of interest contained in that frame.
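As an illustration of template matching, the sketch below slides a template over a frame and scores each position by mean absolute difference. It is a brute-force version in plain Python under stated assumptions: frames and templates are non-empty nested lists of grayscale values, the function names are hypothetical, and the acceptance threshold `max_mean_abs_diff` is an arbitrary illustrative value. The text does not specify a matching metric, so this one is chosen only for simplicity.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]


def match_template(frame: Image, template: Image) -> Optional[Tuple[int, int, float]]:
    """Return (x, y, score) of the best match, or None if the template does not fit.

    The score is the mean absolute pixel difference; lower is better.
    """
    frame_h, frame_w = len(frame), len(frame[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best = None
    for y in range(frame_h - tmpl_h + 1):
        for x in range(frame_w - tmpl_w + 1):
            sad = 0
            for dy in range(tmpl_h):
                for dx in range(tmpl_w):
                    sad += abs(frame[y + dy][x + dx] - template[dy][dx])
            score = sad / (tmpl_h * tmpl_w)
            if best is None or score < best[2]:
                best = (x, y, score)
    return best


def contains_roi(frame: Image, template: Image, max_mean_abs_diff: float = 10.0) -> bool:
    """Treat the ROI as present when the best match is close enough (assumed threshold)."""
    result = match_template(frame, template)
    return result is not None and result[2] <= max_mean_abs_diff


frame = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 200, 210, 0],
    [0, 220, 230, 0],
]
template = [[200, 210], [220, 230]]
best_match = match_template(frame, template)
```

In practice an optimized library routine would replace the nested loops. The point of the sketch is only the decision it feeds: whether a given region of interest still appears in the preselected transition frame.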
The second information of the regions of interest can be obtained from the regions of interest contained in the preselected transition frame. The second information may include, for example, the individual regions of interest and their total number.
Step S204: match the first information against the second information to determine the transition frames that meet the preset transition condition, and perform scene segmentation based on those frames.
Matching the first information against the second information makes it possible to judge each preselected transition frame, that is, to determine whether it is a transition frame that meets the preset transition condition, so that scene segmentation can be based on the frames that do. For example, the ratio between the total number of regions of interest in the first information and the total number in the second information is calculated; from this ratio it can be determined whether the preselected transition frame still contains many of the regions of interest. If it contains many of them, the video scene has not transitioned, that is, the preselected transition frame does not meet the preset transition condition. If it contains few or none, the video scene has transitioned, the preselected transition frame meets the preset transition condition, and scene segmentation can be performed on the basis of that frame.
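The count-based decision can be sketched as follows. The sketch uses the ratio of the second total number to the first total number, as in the optional implementation described in the summary; the function name and the default threshold of 0.5 are illustrative assumptions, since the text only speaks of a "preset first transition threshold".

```python
def is_transition_by_count(first_total: int, second_total: int, threshold: float = 0.5) -> bool:
    """Return True when the preselected frame meets the transition condition.

    first_total:  number of regions of interest found during pre-analysis.
    second_total: number of those regions still detected in the preselected frame.
    threshold:    assumed value for the preset first transition threshold.
    """
    if first_total <= 0:
        raise ValueError("pre-analysis must yield at least one region of interest")
    ratio = second_total / first_total
    # A ratio above the threshold means most ROIs are still present: no transition.
    return ratio <= threshold


still_same_scene = is_transition_by_count(first_total=4, second_total=3)
real_cut = is_transition_by_count(first_total=4, second_total=1)
```

With four regions of interest found in pre-analysis, a frame that still shows three of them is rejected as a transition, while a frame that shows only one is accepted.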
According to the region-of-interest-based scene segmentation method provided by the present application, after scene pre-segmentation yields the preselected transition frames, region-of-interest detection can be performed on each preselected transition frame, and the second information of the detected regions of interest is compared with the first information of the regions of interest determined by pre-analysis. This avoids false detections and achieves accurate scene segmentation.
Figure 3 shows a flow chart of a region-of-interest-based scene segmentation method according to an embodiment of the present application. As shown in Figure 3, the method includes the following steps:
Step S301: pre-analyze the video to be segmented, determine the regions of interest of one or more video frames in the video to be segmented, and extract first information of the regions of interest.
When pre-analyzing the video to be segmented, a single video frame, such as the first frame, can be selected and its regions of interest determined. Alternatively, multiple video frames can be selected, for example frames corresponding to different scenes, and the regions of interest of those frames determined. No limitation is imposed here. The pre-analysis can be carried out through image processing, plot analysis, and so on: for example, analysis determines that the current video shows a basketball game, and image analysis determines that the region of interest is the "basketball". An interface can also be provided for the user to choose the regions of interest: for example, the user enters "basketball", and the region of interest corresponding to "basketball" is determined by analyzing the video frame.
After the regions of interest of the video frame are determined, the first total number and the coordinate information of the regions of interest in the video frame are extracted. The first total number is the total number of regions of interest contained in the video frame. The objects corresponding to the regions of interest may be the same object, for example when the objects of several regions of interest are all "gold coins", or they may differ, for example object 1, object 2, object 3, and so on. When the first total number is computed, the regions of interest of different objects in the video frame are counted to obtain the first total number of regions of interest. The coordinate information identifies a region of interest and includes its corner coordinate information, height information, and width information. The corner coordinate can be that of any corner of the region of interest, such as the upper-left corner. The coordinate information takes a form such as (x_i, y_i, w_i, h_i), where (x_i, y_i) is the upper-left corner of the region of interest, w_i is its width, h_i is its height, and i is the index of the region of interest within the video frame.
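One possible representation of this first information is sketched below. The `Roi` record and the `label` field are assumptions introduced for illustration: the text specifies the (x, y, w, h) coordinates but not how an object is identified. Counting distinct labels follows the rule stated in the summary, under which several regions of interest of the same object are counted once.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Roi:
    """One region of interest: object label plus upper-left corner, width, height."""
    label: str  # hypothetical object identifier, e.g. "basketball"
    x: int
    y: int
    w: int
    h: int


def count_distinct_objects(rois: Iterable[Roi]) -> int:
    """Count regions of interest by distinct object; repeats of one object count once."""
    return len({roi.label for roi in rois})


rois = [
    Roi("coin", 10, 10, 8, 8),
    Roi("coin", 40, 12, 8, 8),
    Roi("basketball", 100, 60, 30, 30),
]
first_total = count_distinct_objects(rois)
```

The same counting function can be applied to the regions detected in a preselected transition frame to obtain the second total number.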
Further, after the video to be segmented is pre-analyzed and its regions of interest are determined, a weight value may be determined for the object of each region of interest, in addition to extracting the first total number and the coordinate information. Different weight values may be set for different objects, for example 0.6 for the "basketball" object and 0.3 for the "hoop" object. The weight values indicate the importance of the objects in the regions of interest, which allows a transition to be identified more precisely. The above are examples and are not limiting.
Step S302: based on scene detection, perform scene pre-segmentation on the video to be segmented to obtain a set of preselected transition frame numbers containing multiple preselected transition frame numbers.
Based on scene detection, for example the scene detection algorithm in the x264 encoder, a metric is calculated for each video frame to estimate how much it differs from the previous frame. Scene pre-segmentation is performed on the video to be segmented in this way, yielding a set of preselected transition frame numbers containing multiple preselected transition frame numbers. The preselected transition frame numbers are arranged in order, and each corresponds to one preselected transition frame.
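The idea of a per-frame metric can be illustrated with a deliberately simple stand-in. The sketch below is not the x264 scene detection algorithm; it assumes grayscale frames stored as nested lists and uses the mean absolute pixel difference as the metric. The function names and the threshold of 30.0 are illustrative assumptions.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def preselect_transition_frames(frames, threshold=30.0):
    """Return, in order, the frame numbers that differ strongly from the previous frame."""
    candidates = []
    for n in range(1, len(frames)):
        if mean_abs_diff(frames[n - 1], frames[n]) > threshold:
            candidates.append(n)
    return candidates


dark = [[0, 0], [0, 0]]
bright = [[200, 200], [200, 200]]
frames = [dark, dark, bright, bright, dark]
print(preselect_transition_frames(frames))  # [2, 4]
```

Frames 2 and 4 are the only ones that differ from their predecessor, so they form the preselected set.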
The execution order of step S301 and step S302 is not limited; either step may be executed first, depending on the implementation.
Step S303: determine the detection template image corresponding to the region of interest according to the coordinate information of the region of interest.
According to the coordinate information of the region of interest extracted in step S301, the corresponding position in the video frame can be located and the detection template image corresponding to the region of interest obtained, so that this detection template image can subsequently be used for region-of-interest detection.
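Obtaining the detection template image from the coordinate information amounts to cropping a rectangle out of the frame. A minimal sketch, assuming a row-major frame stored as nested lists; `crop_template` is a hypothetical helper name:

```python
def crop_template(frame, x, y, w, h):
    """Cut the w-by-h patch whose top-left corner is (x, y) out of a row-major frame."""
    return [row[x:x + w] for row in frame[y:y + h]]


frame = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 6, 0],
]
template = crop_template(frame, x=1, y=1, w=2, h=2)
print(template)  # [[9, 8], [7, 6]]
```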
Step S304: traverse the set of preselected transition frame numbers, and perform region-of-interest detection on the preselected transition frame corresponding to any preselected transition frame number to obtain the second information of the region of interest of the preselected transition frame.
Some of the preselected transition frame numbers in the set may not correspond to actual transitions, so the set must be traversed and the preselected transition frame corresponding to each preselected transition frame number examined further.
The traversal proceeds in order. A preselected transition frame number is obtained first, and region-of-interest detection is performed on the corresponding preselected transition frame. For example, detection is performed according to the detection template image: template matching searches the preselected transition frame for an image that is identical or similar to the detection template image and determines its position, which yields the regions of interest contained in the preselected transition frame.
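The description names template matching without fixing a particular similarity measure. The sketch below assumes one common choice, an exhaustive search that minimizes the sum of absolute differences (SAD); the function name `find_template` and the tolerance parameter `max_sad` are hypothetical.

```python
def find_template(frame, template, max_sad=0):
    """Return (x, y) of the best match if its SAD is within max_sad, else None."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    best_pos = None
    best_sad = None
    # Slide the template over every position where it fits inside the frame.
    for y in range(fh - th + 1):
        for x in range(fw - tw + 1):
            sad = sum(
                abs(frame[y + dy][x + dx] - template[dy][dx])
                for dy in range(th)
                for dx in range(tw)
            )
            if best_sad is None or sad < best_sad:
                best_sad, best_pos = sad, (x, y)
    if best_sad is not None and best_sad <= max_sad:
        return best_pos
    return None


template = [[9, 8], [7, 6]]
candidate_frame = [
    [0, 0, 0, 0],
    [0, 0, 9, 8],
    [0, 0, 7, 6],
]
print(find_template(candidate_frame, template))  # (2, 1)
```

Raising `max_sad` above zero accepts similar rather than identical patches, which corresponds to the "identical or similar" wording above.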
Counting the regions of interest contained in the preselected transition frame yields the second total number, that is, the total number of regions of interest contained in the preselected transition frame. The objects of the multiple regions of interest contained in the preselected transition frame may be the same or different. The regions of interest of the different objects in the preselected transition frame are counted to obtain the second total number: if multiple regions of interest correspond to the same object, they are recorded as one, and the final count gives the second total number of regions of interest contained in the preselected transition frame.
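The counting rule for the second total number can be sketched as follows. The input layout, a dictionary mapping each object name to the list of positions where it was matched, is an assumption made for illustration, as is the name `second_total_count`.

```python
def second_total_count(matches_by_object):
    """Count objects with at least one matched region; repeats of an object count once."""
    return sum(1 for positions in matches_by_object.values() if positions)


matches = {
    "gold coin": [(10, 20), (40, 20)],  # matched twice, recorded as one
    "basketball": [(100, 80)],
    "hoop": [],                          # not found in this frame
}
print(second_total_count(matches))
```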
Step S305: according to the first information and the second information, determine whether the preselected transition frame meets the preset transition condition; if not, delete the preselected transition frame number; obtain the next preselected transition frame number in the set and continue traversing until the whole set has been traversed, so that scene segmentation is performed according to the preselected transition frame numbers remaining in the set.
After region-of-interest detection is performed on the preselected transition frame, whether the frame meets the preset transition condition is determined according to the first information and the second information obtained. Specifically, the ratio of the second total number to the first total number may be calculated from the first total number in the first information and the second total number in the second information. For example, if the first total number is 5 and the second total number is 3, the ratio is 3/5 = 0.6. It is then determined whether the ratio exceeds a preset first transition threshold, which may be set according to the implementation, for example to 0.5. If the ratio exceeds the preset first transition threshold, the preselected transition frame still contains many of the regions of interest, and it is determined not to meet the preset transition condition. If the frame contains few or no regions of interest, the video scene has changed and the preselected transition frame is a transition frame that meets the preset transition condition, so its preselected transition frame number needs no processing.
Alternatively, the sum of the weight values of the objects of the regions of interest is calculated from the first information to obtain a first weight value. The regions of interest contained in the preselected transition frame are determined from the second information, and the weight values of the objects of those regions are accumulated to obtain a second weight value. The ratio of the second weight value to the first weight value is calculated, and it is determined whether the ratio exceeds a preset second transition threshold. If so, the regions of interest contained in the preselected transition frame are the more important ones and the frame still belongs to the current scene, so it is determined not to meet the preset transition condition. If the ratio does not exceed the preset second transition threshold, the video scene has changed and the preselected transition frame is a transition frame that meets the preset transition condition, so its preselected transition frame number needs no processing. The above are examples; the specific settings depend on the implementation and are not limited here.
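The two judgments above can be sketched as a pair of functions. The function names are hypothetical, the default of 0.5 for the second transition threshold is an assumption (the description gives 0.5 only as an example for the first threshold), and raising an error when no region of interest was extracted is a choice made here because the description does not cover that case.

```python
def is_transition_by_count(first_total, second_total, threshold=0.5):
    """True if the frame meets the transition condition under the count-based rule."""
    if first_total <= 0:
        # Not covered by the description; rejected here as an assumption.
        raise ValueError("first_total must be positive")
    return second_total / first_total <= threshold


def is_transition_by_weight(weights, detected_objects, threshold=0.5):
    """True if the frame meets the transition condition under the weight-based rule."""
    first_weight = sum(weights.values())
    if first_weight <= 0:
        raise ValueError("the sum of the weight values must be positive")
    # Each detected object contributes its weight once, even if matched repeatedly.
    second_weight = sum(weights[name] for name in set(detected_objects) if name in weights)
    return second_weight / first_weight <= threshold


weights = {"basketball": 0.6, "hoop": 0.3}
print(is_transition_by_count(5, 3))                      # ratio 0.6 exceeds 0.5
print(is_transition_by_weight(weights, ["basketball"]))  # ratio is about 0.67
print(is_transition_by_weight(weights, ["hoop"]))        # ratio is about 0.33
```

With the example figures from the text (5 and 3), the count-based ratio is 0.6, so the candidate is judged to remain in the current scene.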
When a preselected transition frame is determined not to meet the preset transition condition, the corresponding preselected transition frame number is deleted, and no scene segmentation is subsequently performed at that frame number. The next preselected transition frame number in the set is then obtained, and steps S304 and S305 are executed on its preselected transition frame, until the whole set has been traversed. Once all preselected transition frames have been examined, the preselected transition frame numbers remaining in the set are the accurate transition frame numbers, and scene segmentation can be performed according to them to obtain an accurate result.
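The traversal and the final segmentation can be sketched as below. Both function names are hypothetical. The sketch builds a new list rather than deleting entries in place, which gives the same remaining set, and it assumes that each transition frame number marks the first frame of a new scene, a convention the description does not spell out.

```python
def filter_transition_frames(candidate_frame_numbers, is_transition):
    """Keep, in order, only the candidates for which is_transition returns True."""
    remaining = []
    for frame_number in candidate_frame_numbers:
        if is_transition(frame_number):
            remaining.append(frame_number)
    return remaining


def split_into_scenes(total_frames, transition_frame_numbers):
    """Turn sorted transition frame numbers into half-open (start, end) frame ranges."""
    inner = [n for n in transition_frame_numbers if 0 < n < total_frames]
    boundaries = [0] + inner + [total_frames]
    return list(zip(boundaries[:-1], boundaries[1:]))


# Hypothetical judgment: frame 75 still shows the regions of interest.
remaining = filter_transition_frames([30, 75, 120], lambda n: n != 75)
print(remaining)                          # [30, 120]
print(split_into_scenes(150, remaining))  # [(0, 30), (30, 120), (120, 150)]
```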
According to the region-of-interest-based scene segmentation method provided by this application, scene pre-segmentation is performed on the video to be segmented to obtain a set of preselected transition frame numbers containing multiple preselected transition frame numbers. The video to be segmented is pre-analyzed, the regions of interest of one or more of its video frames are determined, and the first information of the regions of interest is extracted. The detection template image is determined according to the regions of interest, and region-of-interest detection is performed on the preselected transition frame corresponding to each preselected transition frame number to obtain the second information of the regions of interest. The ratio of the second total number to the first total number is calculated from the first total number in the first information and the second total number in the second information; if the ratio exceeds the preset first transition threshold, the preselected transition frame still contains many regions of interest and does not meet the preset transition condition. Alternatively, the first weight value is obtained as the sum of the weight values of the objects of the regions of interest in the first information, the regions of interest contained in the preselected transition frame are determined from the second information, and the weight values of their objects are accumulated to obtain the second weight value. The ratio of the second weight value to the first weight value is calculated; if the ratio exceeds the preset second transition threshold, the preset transition condition is not met. For a preselected transition frame that does not meet the preset transition condition, its preselected transition frame number is deleted, and the next preselected transition frame number in the set is obtained to continue the traversal until the whole set has been traversed. Scene segmentation is then performed according to the preselected transition frame numbers remaining in the set, which ensures accurate scene segmentation.
Figure 4 shows a schematic structural diagram of a region-of-interest-based scene segmentation device provided by an embodiment of the present application. As shown in Figure 4, the device includes:
a region of interest module 410, adapted to pre-analyze the video to be segmented, determine the region of interest of the video to be segmented, and extract the first information of the region of interest;
a pre-segmentation module 420, adapted to perform scene pre-segmentation on the video to be segmented to obtain at least one preselected transition frame;
a detection module 430, adapted to perform region-of-interest detection on the preselected transition frame to obtain the second information of the region of interest;
a matching module 440, adapted to perform matching according to the first information and the second information to determine the transition frames that meet the preset transition condition, for scene segmentation.
Optionally, the region of interest module 410 is further adapted to:
pre-analyze the video to be segmented, determine the region of interest of a video frame in the video to be segmented, and extract the first total number and the coordinate information of the regions of interest in the video frame; the coordinate information includes the corner coordinate information, height information, and width information of the region of interest.
Optionally, the objects corresponding to the regions of interest are the same or different;
the region of interest module 410 is further adapted to: count the regions of interest of different objects in the video frame to obtain the first total number of regions of interest.
Optionally, the region of interest module 410 is further adapted to:
pre-analyze the video to be segmented and determine the regions of interest of one or more video frames in the video to be segmented.
Optionally, the device further includes: a template image module 450, adapted to determine the detection template image corresponding to the region of interest according to the coordinate information of the region of interest.
Optionally, the detection module 430 is further adapted to:
perform region-of-interest detection on the preselected transition frame according to the detection template image to obtain the regions of interest contained in the preselected transition frame, and count them to obtain the second total number of regions of interest contained in the preselected transition frame.
Optionally, the detection module 430 is further adapted to:
perform region-of-interest detection on the preselected transition frame according to the detection template image to obtain the multiple regions of interest contained in the preselected transition frame;
count the regions of interest of the different objects in the preselected transition frame to obtain the second total number of regions of interest, wherein, if multiple regions of interest correspond to the same object, they are recorded as one;
obtain by this count the second total number of regions of interest contained in the preselected transition frame.
Optionally, the matching module 440 is further adapted to:
calculate the ratio of the second total number to the first total number from the first total number in the first information and the second total number in the second information;
determine whether the ratio exceeds the preset first transition threshold;
if so, determine that the preselected transition frame does not meet the preset transition condition.
Optionally, the region of interest module 410 is further adapted to:
pre-analyze the video to be segmented, determine the regions of interest of the video to be segmented, and extract the first total number and the coordinate information of the regions of interest and the weight value of the object of each region of interest;
the matching module 440 is further adapted to:
calculate, according to the first information, the sum of the weight values of the objects of the regions of interest to obtain the first weight value;
determine, according to the second information, the regions of interest contained in the preselected transition frame, and accumulate the weight values of the objects of those regions to obtain the second weight value;
calculate the ratio of the second weight value to the first weight value, and determine whether the ratio exceeds the preset second transition threshold;
if so, determine that the preselected transition frame does not meet the preset transition condition.
Optionally, the pre-segmentation module 420 is further adapted to:
perform, based on scene detection, scene pre-segmentation on the video to be segmented to obtain a set of preselected transition frame numbers containing multiple preselected transition frame numbers, wherein the preselected transition frame numbers correspond one-to-one to the preselected transition frames.
Optionally, the detection module 430 and the matching module 440 are further adapted to:
traverse the set of preselected transition frame numbers, and perform region-of-interest detection on the preselected transition frame corresponding to any preselected transition frame number to obtain the second information of the region of interest of the preselected transition frame; perform matching according to the first information and the second information to determine whether the preselected transition frame is a transition frame that meets the preset transition condition; if not, delete the preselected transition frame number; obtain the next preselected transition frame number in the set and continue traversing until the whole set has been traversed, so that scene segmentation is performed according to the preselected transition frame numbers remaining in the set.
For the description of each of the above modules, refer to the corresponding description in the method embodiments; it is not repeated here.
According to the region-of-interest-based scene segmentation device provided by this application, after scene pre-segmentation yields a set of preselected transition frame numbers, region-of-interest detection can be performed on the preselected transition frame corresponding to each preselected transition frame number, and the second information of the detected regions of interest is judged against the first information of the regions of interest determined by pre-analysis. This avoids false detections and achieves precise scene segmentation.
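How the four modules cooperate can be sketched as a class that wires them together. This is an illustrative composition only: the class name, the callable-based interfaces, and the stub modules in the usage lines are assumptions, and the stubs reduce the first and second information to plain counts.

```python
class SceneSegmentationDevice:
    """Wires together the four modules of Figure 4 (interfaces are assumed)."""

    def __init__(self, roi_module, pre_segmentation_module, detection_module, matching_module):
        self.roi_module = roi_module                            # module 410
        self.pre_segmentation_module = pre_segmentation_module  # module 420
        self.detection_module = detection_module                # module 430
        self.matching_module = matching_module                  # module 440

    def segment(self, video):
        """Return the preselected frame numbers judged to be real transitions."""
        first_info = self.roi_module(video)
        candidates = self.pre_segmentation_module(video)
        return [
            n for n in candidates
            if self.matching_module(first_info, self.detection_module(video, n))
        ]


# Stub modules standing in for real analysis (hypothetical values).
device = SceneSegmentationDevice(
    roi_module=lambda video: 5,
    pre_segmentation_module=lambda video: [30, 75, 120],
    detection_module=lambda video, n: {30: 1, 75: 3, 120: 0}[n],
    matching_module=lambda first, second: second / first <= 0.5,
)
transitions = device.segment(video=None)
print(transitions)  # [30, 120]
```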
This application also provides a non-volatile computer storage medium. The computer storage medium stores at least one executable instruction, and the executable instruction can execute the region-of-interest-based scene segmentation method of any of the above method embodiments.
Figure 5 shows a schematic structural diagram of a computing device according to an embodiment of the present application. The specific embodiments of the present application do not limit the specific implementation of the computing device.
As shown in Figure 5, the computing device may include: a processor 502, a communications interface 504, a memory 506, and a communication bus 508.
Wherein:
The processor 502, the communications interface 504, and the memory 506 communicate with one another through the communication bus 508.
The communications interface 504 is used to communicate with network elements of other devices, such as clients or other servers.
The processor 502 is used to execute a program 510, and may specifically execute the relevant steps of the above embodiments of the region-of-interest-based scene segmentation method.
Specifically, the program 510 may include program code, and the program code includes computer operation instructions.
The processor 502 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the present application. The one or more processors included in the computing device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 506 is used to store the program 510. The memory 506 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory.
The program 510 may specifically be used to cause the processor 502 to execute the region-of-interest-based scene segmentation method of any of the above method embodiments. For the specific implementation of each step in the program 510, refer to the descriptions of the corresponding steps and units in the above embodiments of region-of-interest-based scene segmentation; they are not repeated here. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the devices and modules described above can be found in the corresponding process descriptions of the foregoing method embodiments and are not repeated here.
The algorithms or displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the above description, the structure required to construct such systems is apparent. Furthermore, this application is not directed to any particular programming language. It should be understood that the content of this application described herein may be implemented in a variety of programming languages, and that the above description of a specific language is intended to disclose the preferred embodiment of this application.
Numerous specific details are set forth in the description provided here. It is understood, however, that embodiments of the present application may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the present application and aid the understanding of one or more of the various inventive aspects, features of the present application are sometimes grouped together into a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed application requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Thus, the claims following the Detailed Description are hereby expressly incorporated into this Detailed Description, with each claim standing on its own as a separate embodiment of this application.
Those skilled in the art will understand that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules, units, or components of the embodiments may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will understand that, although some embodiments herein include certain features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the present application and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present application may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components according to the present application. The present application may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the methods described herein. Such a program implementing the present application may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the application, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The application may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order; these words may be interpreted as names. Unless otherwise specified, the steps in the above embodiments should not be understood as limiting the order of execution.
Claims (14)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310895397.4A CN116935275A (en) | 2023-07-19 | 2023-07-19 | Scene segmentation method and device based on area of interest |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116935275A true CN116935275A (en) | 2023-10-24 |
Family
ID=88392078
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310895397.4A Pending CN116935275A (en) | 2023-07-19 | 2023-07-19 | Scene segmentation method and device based on area of interest |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN116935275A (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070183661A1 (en) * | 2006-02-07 | 2007-08-09 | El-Maleh Khaled H | Multi-mode region-of-interest video object segmentation |
| CN101072342A (en) * | 2006-07-01 | 2007-11-14 | 腾讯科技(深圳)有限公司 | Situation switching detection method and its detection system |
| EP2034426A1 (en) * | 2007-06-18 | 2009-03-11 | Sony (China) LTD | Moving image analyzing, method and system |
| CN115909219A (en) * | 2022-12-29 | 2023-04-04 | 深圳市诺龙技术股份有限公司 | Scene change detection method and system based on video analysis |
Non-Patent Citations (2)
| Title |
|---|
| A. CHERGUI et al.: "Video scene segmentation using the shot transition detection by local characterization of the points of interest", 2012 6TH INTERNATIONAL CONFERENCE ON SETIT, 21 March 2012 (2012-03-21) * |
| 方宏俊; 宋利; 杨小康: "Low-complexity video scene-change detection method adapted to dynamic resolution changes", Computer Science (计算机科学), no. 02, 15 February 2017 (2017-02-15) * |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| JP5420199B2 (en) | Video analysis device, video analysis method, digest automatic creation system and highlight automatic extraction system | |
| CN105144239B (en) | Image processing apparatus, image processing method | |
| CN108810620B (en) | Method, apparatus, device and storage medium for identifying key time points in video | |
| US8326042B2 (en) | Video shot change detection based on color features, object features, and reliable motion information | |
| CN107833213B (en) | A Weakly Supervised Object Detection Method Based on False-True Value Adaptive Method | |
| US9311533B2 (en) | Device and method for detecting the presence of a logo in a picture | |
| CN111061898A (en) | Image processing method, device, computer equipment and storage medium | |
| CN111311475A (en) | Detection model training method and device, storage medium and computer equipment | |
| JP2000112997A (en) | How to automatically classify images into events | |
| CN108564579A (en) | A kind of distress in concrete detection method and detection device based on temporal and spatial correlations | |
| CN110516572B (en) | Method for identifying sports event video clip, electronic equipment and storage medium | |
| Mustamo | Object detection in sports: TensorFlow Object Detection API case study | |
| CN114445768A (en) | Target identification method and device, electronic equipment and storage medium | |
| CN109460724B (en) | Object detection-based separation method and system for ball-stopping event | |
| CN109871792B (en) | Pedestrian detection method and device | |
| CN111429341A (en) | Video processing method, video processing equipment and computer readable storage medium | |
| CN112381054A (en) | Method for detecting working state of camera and related equipment and system | |
| CN110472561B (en) | Soccer goal type identification method, device, system and storage medium | |
| CN115937263B (en) | Vision-based target tracking method, system, electronic device and storage medium | |
| WO2018058573A1 (en) | Object detection method, object detection apparatus and electronic device | |
| CN105745598A (en) | Determine the shape of a representation of an object | |
| US12094183B2 (en) | Geometric pattern matching method and device for performing the method | |
| CN112541428B (en) | Football recognition method, football recognition device and robot | |
| CN116935275A (en) | Scene segmentation method and device based on area of interest | |
| CN114550062A (en) | Method and device for determining moving object in image, electronic equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |