WO2023125028A1

WO2023125028A1 - System for improving positioning precision of pan-tilt camera and control method therefor

Info

Publication number: WO2023125028A1
Application number: PCT/CN2022/139161
Authority: WO
Inventors: 殷锐; 袁建涛; 万安平; 沈熠能; 方春燕; 孙鹿阳; 林郑一雯; 王晨阳; 张文斌
Original assignee: Zhejiang University City College ZUCC
Current assignee: Hangzhou City University
Priority date: 2021-12-27
Filing date: 2022-12-15
Publication date: 2023-07-06
Anticipated expiration: 2024-06-27
Also published as: GB2619136A; CN113989124A; CN113989124B

Abstract

The invention relates to the technical field of computer vision in remote monitoring systems, and in particular to a control method for improving the positioning precision of a pan-tilt camera. The control method comprises: first, performing initialization; then displaying a panoramic image of the application scenario on a display screen of a computing platform, where a user selects a region of interest in the panoramic image, an image of the region of interest is stored as a target image, and the user selects one target image from a plurality of target images as the target to turn toward. The beneficial effects of the invention are as follows: control is performed at the software layer, so the hardware of a traditional camera does not need to be modified; the target can be positioned accurately through software even if the rotating axis of the camera is inaccurate, so the camera does not need to be replaced. Hardware deployment costs can thus be greatly reduced and the service life of the camera hardware prolonged. The invention also allows the user to balance the speed and precision of panorama composition and matching calculations according to the actual situation, which makes it highly flexible.

Description

A system for improving the positioning accuracy of a pan-tilt camera and its control method

This application claims priority to Chinese patent application No. 202111608415.3, filed with the China Patent Office on December 27, 2021 and entitled "A System and Control Method for Improving the Positioning Accuracy of PTZ Cameras", the entire content of which is incorporated herein by reference.

Technical field

The invention belongs to the technical field of computer vision in remote monitoring systems, and in particular relates to a system for improving the positioning accuracy of a pan-tilt camera and a control method therefor.

Background art

Urban security and monitoring equipment is now used in all walks of life, and the Chinese government continues to increase its investment of manpower, materials and funds in security monitoring in pursuit of the goal of a "harmonious society and safe cities". The traditional surveillance camera, however, is a bullet camera whose monitoring position is fixed, so its monitoring range is limited. Pan-tilt cameras were introduced to solve this problem. Monitoring personnel can rotate a pan-tilt camera from a keyboard, and most pan-tilt cameras provide a preset point function for security purposes. With this function, the pan-tilt camera saves the current mechanical position of its motors, and the user can later return to that position from any other position. A preset point records the pitch and pan angles of the pan-tilt head, the focal length of the lens, and shooting parameters such as aperture, exposure and white balance; it is stored on the SD card inside the camera and assigned a number by which the user can refer to it. The user can instruct the camera to restore its shooting parameters to the values recorded in the preset point with a given number and thereby capture the predetermined area. The preset point function makes inspection much more convenient for monitoring personnel.

However, preset points must be configured manually, and because they use the mechanical position as the reference for turning, they cannot cover the on-site environment completely. When monitoring personnel want to view an area outside the preset points, they have to rotate the pan-tilt head manually, which hinders them from grasping the on-site situation in real time. In addition, because of mechanical aging caused by long-term motor operation, after extended use the preset point function often can no longer rotate accurately to the area the user is interested in. Likewise, as the camera's rotating shaft ages, the traditional mechanical rotation method becomes inaccurate at reaching the user's point of interest after a period of use, so the camera has to be replaced frequently, which increases hardware costs.

Summary of the invention

The purpose of the invention is to overcome the deficiencies of the prior art and to provide a system for improving the positioning accuracy of a pan-tilt camera and a control method therefor.

To achieve the above purpose, the invention provides the following technical solution:

A control method for improving the positioning accuracy of a pan-tilt camera, comprising the following steps:

Step 101. Initialization: the computing platform and the pan-tilt camera are both powered on; after their self-tests are complete, the computing platform uses the ffmpeg library to retrieve the network video stream of the pan-tilt camera frame by frame. The computing platform then controls the pan-tilt camera to collect image information of the entire application scene and constructs a panorama of the application scene using an image stitching method.

Step 102. The display screen of the computing platform displays the panorama of the application scene. The user selects regions of interest in the panorama, and the image of each region of interest is saved as a target image; the user then selects one of the target images as the target to turn toward.

Step 103. Coarse search: the computing platform receives the target image provided by the user, obtains the coordinates of the target image and of the current image using the FLANN matching algorithm, and calculates the coordinate difference between them. If the coordinate difference does not satisfy the threshold condition, the rotation direction and rotation distance of the pan-tilt camera are determined from the coordinate difference, and the pan-tilt camera is controlled to rotate accordingly; if the coordinate difference satisfies the threshold condition, no rotation is performed. Step 103 is repeated until the coordinate difference between the target image and the current image satisfies the threshold condition.

Step 104. Fine search: the computing platform selects two images that are located close to each other and whose matching points with the target image are close together, and extracts their feature points. Using a perspective transformation, the two images are stitched in the same coordinate system to obtain the precise abscissa difference Δx_p and the precise ordinate difference Δy_p. If the abscissa and ordinate differences satisfy the threshold condition, the computing platform sends a control command through the serial port of the pan-tilt camera, and the camera rotates continuously according to the calculated precise distance and direction for a fixed time window T_s. If the abscissa and ordinate differences do not satisfy the threshold condition, step 104 is repeated; image matching ends when the distance between the two images is less than the set threshold R_s. The target is then located precisely and the latitude and longitude of its position are output, so that the control circuit can rotate the camera to the target area selected by the user.

Preferably, the pan-tilt camera is a camera that supports a network video protocol such as RTSP (Real Time Streaming Protocol); the computing platform communicates with the pan-tilt camera in a wired or wireless manner.

Preferably, the computing platform controls the pan-tilt camera to collect image information of the entire application scene as follows:

The computing platform sends a rotation command through the serial port of the pan-tilt camera, and the camera turns left or right for a fixed time window T_q and then stops. For different scenes, the user adjusts the balance factor σ to trade off the speed of panorama generation against its success rate. The balance formula is:

$$A = 1 - \sigma, \qquad T_q = \sigma T \tag{1}$$

where A is the stitching success rate, T is a fixed time period, and T_q is the fixed time window. When σ = 1, images are acquired fastest and the success rate is lowest; when σ = 0, images are acquired slowest and the success rate is highest.

After each rotation stops, the computing platform extracts the real-time video frame and saves it as a picture. Video frames continue to be extracted and saved in this way until all of the working scene information has been saved to the computing platform.
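The acquisition procedure and formula (1) can be sketched as follows. This is a minimal illustration and not the patented implementation: `rotate`, `grab_frame` and `total_time` are hypothetical stand-ins for the serial rotation command, the frame capture, and the total rotation time needed to cover the scene, and capturing one frame before the first rotation is an assumption.

```python
import math
from typing import Callable, List, Tuple


def balance(sigma: float, T: float) -> Tuple[float, float]:
    """Return (A, T_q) from formula (1): A = 1 - sigma, T_q = sigma * T."""
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    return 1.0 - sigma, sigma * T


def sweep(rotate: Callable[[float], None],
          grab_frame: Callable[[], str],
          t_q: float,
          total_time: float) -> List[str]:
    """Rotate for t_q, stop, save one frame; repeat until total_time is covered.

    rotate, grab_frame and total_time are hypothetical placeholders.
    """
    if t_q <= 0:
        raise ValueError("t_q must be positive, otherwise the camera never advances")
    steps = math.ceil(total_time / t_q)
    frames = [grab_frame()]          # assumed: capture the starting view first
    for _ in range(steps):
        rotate(t_q)                  # turn for one fixed time window, then stop
        frames.append(grab_frame())  # save the frame seen after stopping
    return frames
```

A smaller σ shortens each rotation, so neighbouring frames overlap more and more frames are collected, which is the speed-versus-success trade-off the formula expresses. Note that σ = 0 gives T_q = 0 in formula (1), which this sketch rejects because the camera would never move.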

Preferably, the image stitching method is based on the Stitching method in OpenCV: the computing platform stitches the saved scene images to form the panorama of the application scene.

Preferably, step 103 specifically comprises the following steps:

Step 103-1. After receiving the target image provided by the user, the computing platform matches the target image against the panoramic image using the FLANN matching algorithm, which returns the coordinates (X, Y) of all target points whose features match. The coordinates of adjacent target points are compared; if the distance between two adjacent target points exceeds the length or width of the original image, the target point is reclassified as a mismatched point.

Step 103-2. All matching points are traversed to find the four vertices of the region containing them: the point with the minimum x and minimum y coordinates, the point with the minimum x and maximum y coordinates, the point with the maximum x and minimum y coordinates, and the point with the maximum x and maximum y coordinates. If the width or length of this region is greater than that of the original image, the two outermost matching points on the x-axis or y-axis are discarded and the four vertices are reconstructed, until the region is no larger than the picture obtained by the pan-tilt camera. For the x-axis, if max(X) - min(X) is greater than the width of the original image, all matched target point coordinates (X, Y) whose x coordinate equals max(X) or min(X) are discarded; for the y-axis, if max(Y) - min(Y) is greater than the length of the original image, all matched target point coordinates (X, Y) whose y coordinate equals max(Y) or min(Y) are discarded. Finally, the midpoint of the matching image is calculated from the four vertices.
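The trimming and midpoint calculation of step 103-2 can be sketched as below. The function name and the choice to take the midpoint as the centre of the axis-aligned rectangle spanned by the four vertices are assumptions made for illustration; the original formula is not reproduced in the text.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def match_midpoint(points: List[Point], width: float, height: float) -> Point:
    """Trim extreme matches until their bounding box fits a width x height
    frame, then return the centre of that box (assumed midpoint definition)."""
    pts = list(points)
    while pts:
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        x_min, x_max = min(xs), max(xs)
        y_min, y_max = min(ys), max(ys)
        if x_max - x_min > width:
            # Discard every point whose x equals max(X) or min(X).
            pts = [p for p in pts if p[0] not in (x_min, x_max)]
        elif y_max - y_min > height:
            # Discard every point whose y equals max(Y) or min(Y).
            pts = [p for p in pts if p[1] not in (y_min, y_max)]
        else:
            return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)
    raise ValueError("no matching points left after trimming")
```

Because both extremes are dropped whenever the box is too large, a single far-away false match also costs one genuine point on the opposite side; that is how the step is described, and it keeps the rule simple at the price of a slightly smaller sample.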

Step 103-3. The computing platform reads the current video frame and compares the current frame and the target image separately against the panorama of the application scene, obtaining the matching midpoint of the current image and the matching midpoint of the target image.

Step 103-4. The rotation direction of the camera is determined from the distance Δx between the current-image matching midpoint and the target-image matching midpoint in the x-axis direction and their distance Δy in the y-axis direction. The computing platform sends control commands through the serial port of the pan-tilt camera so that the camera rotates continuously at speed θ_c; each rotation lasts a fixed time window T_c. After each rotation, the computing platform reads the current video frame again and recalculates the midpoint of the matching image and its coordinate difference from the target image. If the coordinate difference is greater than or equal to the preset accuracy threshold R_c, steps 103-1 to 103-4 are repeated until the distance between the two matching midpoints is less than the threshold R_c.

Preferably, the rotation time windows used in the coarse search of step 103 and the fine search of step 104 are user-defined:

$$T_c = \alpha T \tag{4}$$

$$T_s = \beta T \tag{5}$$

where α, β ∈ [0, 1] help the user adjust the threshold flexibly. When α = 1 and β = 0, T_c reaches its maximum: each rotation lasts longer, fewer calculations are needed, and matching is faster. When α = 0 and β = 0, T_c = T_s = T_min, where T_min is the configured minimum rotation time window.

Preferably, in steps 103 and 104 the computing platform controls the pan-tilt camera to rotate according to the obtained rotation direction and rotation distance, and the rotation direction falls into the following 8 cases:

When Δx > R and -R ≤ Δy ≤ R, the pan-tilt camera rotates to the right at speed θ_c;

When Δx < -R and -R ≤ Δy ≤ R, the pan-tilt camera rotates to the left at speed θ_c;

When -R ≤ Δx ≤ R and Δy > R, the pan-tilt camera rotates upward at speed θ_c;

When -R ≤ Δx ≤ R and Δy < -R, the pan-tilt camera rotates downward at speed θ_c;

When Δx > R and Δy > R, the pan-tilt camera rotates to the upper right at speed θ_c;

When Δx > R and Δy < -R, the pan-tilt camera rotates to the lower right at speed θ_c;

When Δx < -R and Δy > R, the pan-tilt camera rotates to the upper left at speed θ_c;

When Δx < -R and Δy < -R, the pan-tilt camera rotates to the lower left at speed θ_c;

where R > 0 is the preset accuracy threshold R_c or R_s. In the coarse search, the search stops when the difference between the camera coordinates and the target coordinates is less than or equal to R_c; in the fine matching, matching stops when the difference between the camera coordinates and the target coordinates is less than or equal to R_s.
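The eight rotation cases reduce to two independent three-way comparisons, one per axis. The sketch below is illustrative only: it reads "left" and "down" as Δx < -R and Δy < -R, which is the reading consistent with the diagonal cases and with the stop condition, and it adds a ninth result, "stop", for the situation where both differences are within the threshold. The function name and the returned labels are hypothetical.

```python
def rotation_direction(dx: float, dy: float, r: float) -> str:
    """Map coordinate differences (dx, dy) and threshold r > 0 to a direction label."""
    if r <= 0:
        raise ValueError("threshold r must be positive")
    horizontal = "right" if dx > r else "left" if dx < -r else ""
    vertical = "up" if dy > r else "down" if dy < -r else ""
    if not horizontal and not vertical:
        return "stop"  # both differences within [-r, r]: no rotation needed
    # Join the non-empty parts, e.g. "up" + "right" -> "up-right".
    return "-".join(part for part in (vertical, horizontal) if part)
```

For example, `rotation_direction(5, -5, 2)` yields `"down-right"`, and `rotation_direction(1, -2, 2)` yields `"stop"` because both values lie inside the closed interval [-2, 2].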

A system for improving the positioning accuracy of a pan-tilt camera, comprising:

a pan-tilt camera for rotatable image capture;

a battery module and a power supply module for supplying power to the computing platform and the pan-tilt camera; and

a computing platform for communicating with the pan-tilt camera and controlling its rotation, comprising:

an information receiving device for retrieving the network video stream of the pan-tilt camera frame by frame and for receiving the target image provided by the user;

a control device for controlling the pan-tilt camera to collect image information of the entire application scene and for controlling the pan-tilt camera to rotate;

a display device for displaying the application scene and allowing the user to select the target image;

a data processing device for performing image feature extraction and image stitching and for obtaining the coordinate difference between the target image and the current image; and

a data storage device for saving video frames extracted in real time as pictures.

The beneficial effects of the invention are as follows:

To address the problems in the prior art that preset points must be configured manually and that aging of the camera's rotating shaft forces frequent camera replacement, the invention proposes a system for improving the positioning accuracy of a pan-tilt camera and a control method therefor. The system constructs a panoramic image of the scene the user cares about, giving a clearer grasp of the whole scene. The panoramic image is compared with the target image that the user supplies to the matching algorithm and with the current camera image, and the algorithm controls the camera so that it can turn accurately and clearly from any position to the location of the target image, completing the monitoring of the region of interest. This solves the inconvenience and inaccuracy of the traditional preset point function and makes the monitoring system more scientific.

In addition, the method of the invention operates at the software level and requires no modification of the hardware of a traditional camera. Even if the camera's rotation axis is imprecise, the target can be positioned accurately through software, so the camera does not need to be replaced; this greatly reduces hardware deployment costs and extends the service life of the camera hardware. The invention also allows the user to balance the speed and precision of the panorama construction and matching algorithms according to actual conditions, which makes it highly flexible.

Brief description of the drawings

To illustrate the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of the architecture of a system for improving the positioning accuracy of a pan-tilt camera to which the invention is applied;

Fig. 2 is a flow chart of the control method of the system for improving the positioning accuracy of a pan-tilt camera provided by the invention;

Fig. 3 is a schematic diagram of the first stage of matching target image coordinates provided by the invention;

Fig. 4 is a schematic diagram of the second stage of matching target image coordinates provided by the invention;

Fig. 5 shows the results of testing different target images in an outdoor campus environment in an embodiment of the invention;

Fig. 6 shows the results of testing different target images in an indoor campus environment in an embodiment of the invention;

Fig. 7 shows the results of testing different target images in an indoor environment approximating a factory in an embodiment of the invention.

Detailed description

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.

Embodiment 1

As shown in Fig. 1, this embodiment provides a system for improving the positioning accuracy of a pan-tilt camera, comprising a pan-tilt camera, a computing platform, a power supply module and a battery module. The pan-tilt camera is used for rotatable image capture. The battery module and the power supply module supply power to the computing platform and the pan-tilt camera. The computing platform communicates with the pan-tilt camera and controls its rotation, and comprises: an information receiving device for retrieving the network video stream of the pan-tilt camera frame by frame and for receiving the target image provided by the user; a control device for controlling the pan-tilt camera to collect image information of the entire application scene and for controlling the pan-tilt camera to rotate; a display device for displaying the application scene and allowing the user to select the target image; a data processing device for performing image feature extraction and image stitching and for obtaining the coordinate difference between the target image and the current image; and a data storage device for saving video frames extracted in real time as pictures.

Embodiment 2

On the basis of Embodiment 1, Embodiment 2 of this application provides the workflow of the control method for the system of Embodiment 1, as shown in Fig. 2. The method runs on a commonly used, normally operating computing platform and is a high-precision matching method for pan-tilt cameras that combines the FLANN algorithm with image perspective transformation. The pan-tilt camera continuously interacts with the environment and transmits a network video stream to the computing platform, which displays it on screen. The user selects a region of interest, and the image of that region is provided to the matching algorithm, which then controls the camera to turn precisely to that region.

The control method of the invention comprises steps 101 to 104, which are described in detail below.

Step 101. Constructing the panorama by image stitching.

Specifically, the computing platform sends a rotation command through the serial port of the pan-tilt camera so that the camera turns left (or right, depending on the working scene) for a fixed time window T_q and then stops. T denotes a fixed time period and σ is a balance factor; for different scenes, the user can adjust σ to trade off the speed of panorama generation against its success rate. The balance formula can be expressed as:

$$A = 1 - \sigma, \qquad T_q = \sigma T \tag{1}$$

where A is the stitching success rate. σ = 1 gives the fastest image acquisition but the lowest success rate; σ = 0 gives the slowest image acquisition but the highest success rate. The user can adjust the balance factor according to the target scene and specific requirements.

After each rotation stops, the computing platform extracts the real-time video frame and saves it as a picture. The camera repeats this action until all required working scene information has been saved to the computing platform. The computing platform then stitches the saved scene images using the Stitching method in OpenCV to form the panorama.

Step 102. The user selects a region of interest.

Specifically, the computing platform presents the panoramic image to the user. The user can designate any area of the working scene as a region of interest in advance and save its image as a target image; the user can then select one of the target images as the target to turn toward.

Step 103. Coarse search: the FLANN algorithm obtains the coordinates of the target image and the current image, and the camera is rotated according to the coordinate difference. The coordinate difference is then recalculated; if the threshold condition is not met, step 103 is repeated until it is.

Specifically, after receiving the user's target image, the computing platform matches it against the panoramic image using the FLANN matching algorithm. FLANN returns the coordinates (X, Y) of all target points whose features match. First, because an industrial scene contains many similar devices or objects, FLANN can easily match two feature points that are far apart but look alike. Adjacent points are therefore compared, and if their distance exceeds the length (or width) of the original image, the point is reclassified as a mismatched point. Then all matching points are traversed to find four vertices: the point with the minimum x and minimum y coordinates, the point with the minimum x and maximum y coordinates, the point with the maximum x and minimum y coordinates, and the point with the maximum x and maximum y coordinates. If the width or length of the region they span is greater than that of the original image, the two outermost matching points on the x-axis (or y-axis) are discarded and the four vertices are reconstructed, until the region is no larger than the camera picture. Finally, the midpoint of the matching image is calculated from the four vertices.

Further, the search begins with the coarse stage. The computing platform reads the current video frame and compares the current frame and the target image separately against the panorama, obtaining the matching midpoint of the current image and the matching midpoint of the target image. The rotation direction of the camera is determined from the distance between the two points along the x-axis (Δx) and along the y-axis (Δy).

The computing platform sends control commands through the serial port of the pan-tilt camera so that the camera rotates continuously at speed θ_c. The rotation direction is one of the 8 cases enumerated in the summary of the invention above, where R > 0 is one of the preset accuracy thresholds R_c and R_s.

At this stage, each rotation of the camera lasts a fixed time window T_c. After each rotation, the computing platform reads the current video frame again and recalculates the midpoint of the matching image and its coordinate difference from the target image. If the preset accuracy threshold is not met, the coarse-stage operation is repeated until the distance between the two matching midpoints is less than the threshold R_c.
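The coarse stage is a measure, rotate, re-measure loop, which can be sketched as follows. This is an illustration under stated assumptions: `measure` and `rotate` are hypothetical callables standing in for the FLANN-based midpoint comparison and the serial rotation command (one time window T_c at speed θ_c), the threshold is applied per axis, and `max_steps` is an added safety limit that the text does not mention.

```python
from typing import Callable, Tuple


def coarse_search(measure: Callable[[], Tuple[float, float]],
                  rotate: Callable[[int, int], None],
                  r_c: float,
                  max_steps: int = 100) -> int:
    """Rotate one fixed time window at a time until |dx| and |dy| are within r_c.

    Returns the number of rotations issued. measure, rotate and max_steps
    are hypothetical placeholders, not part of the described method.
    """
    for step in range(max_steps):
        dx, dy = measure()
        sx = (dx > r_c) - (dx < -r_c)  # +1 right, -1 left, 0 hold
        sy = (dy > r_c) - (dy < -r_c)  # +1 up, -1 down, 0 hold
        if sx == 0 and sy == 0:
            return step
        rotate(sx, sy)
    raise RuntimeError("coarse search did not converge")
```

Because each rotation covers a fixed distance, the loop can only bring the camera to within about one step of the target; a step larger than twice R_c could overshoot and oscillate, which is why a safety limit is included here and why a separate fine stage follows.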

步骤104，精确寻找：提取特征点，并做图像透视变换获得精确坐标差值，且根据坐标差值，控制摄像头转动。再次计算坐标差值，若不满足阈值条件则重复步骤104，直到满足阈值条件。Step 104, precise search: extract feature points, perform image perspective transformation to obtain precise coordinate difference, and control camera rotation according to coordinate difference. The coordinate difference is calculated again, and if the threshold condition is not met, step 104 is repeated until the threshold condition is met.

Specifically, at this point the current video frame is close to the target image but not exactly matched, and FLANN cannot achieve pixel-level matching in this situation. The computing platform extracts the feature points of the two images. Because the images are close together and the matching points are near one another, a perspective transformation can be used to stitch the two images in the same coordinate system, yielding the precise distances Δx_p and Δy_p. The computing platform sends control commands through the serial port of the pan-tilt camera so that the camera rotates continuously according to the calculated precise distance and direction, with each rotation lasting a fixed time window of length T_s. This step is repeated, and the matching method ends when the distance between the two images is less than the set threshold R_s.
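One way to read the precise offsets off a perspective transformation is sketched below. It assumes a 3x3 homography `h` has already been estimated from the matched feature points (for instance with OpenCV's `cv2.findHomography`), that `h` maps target-image pixels to current-frame pixels, and that both images share the same size. The helper names and the sign convention of the result are illustrative assumptions, not taken from the source.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def apply_homography(h: Matrix3, x: float, y: float) -> Tuple[float, float]:
    """Project the point (x, y) through the 3x3 perspective matrix h."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    u = (h[0][0] * x + h[0][1] * y + h[0][2]) / w
    v = (h[1][0] * x + h[1][1] * y + h[1][2]) / w
    return u, v


def precise_offset(h: Matrix3, width: float, height: float) -> Tuple[float, float]:
    """Offset (dx_p, dy_p) of the target centre from the current frame centre."""
    cx, cy = width / 2, height / 2
    tx, ty = apply_homography(h, cx, cy)  # target centre in current-frame pixels
    return tx - cx, ty - cy


# A pure translation of 12 px right and 7 px up, as a simple check.
shift = [[1, 0, 12], [0, 1, -7], [0, 0, 1]]
dx_p, dy_p = precise_offset(shift, width=640, height=480)
```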

In the two-stage method of rough search and precise search proposed in steps 103 and 104, the rotation time windows T_c and T_s of the two stages can be customized to balance speed against accuracy.

In the rough search and precise search of steps 103 and 104, the rotation time windows are set as:

T_c = αT (4)

T_s = βT (5)

Here α, β ∈ [0, 1]; these parameters help the user adjust the threshold values flexibly. When α = 1 and β = 0, the system favors speed in the rough matching stage: T_c reaches its maximum, so each rotation is long, fewer calculations are needed and matching is fast, although problems such as overshooting the target may occur. In the precise matching stage the system favors accuracy: T_s reaches its minimum, so each rotation is short, more calculations are needed and matching is slow, but the matching accuracy is high and the failure rate is low. Note in particular that when α = 0 and β = 0, T_c and T_s do not drop to 0; instead T_c = T_s = T_min, where T_min is the configured minimum rotation time window.
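Equations (4) and (5) together with the minimum window can be sketched as below. The text only states the result for α = 0 and β = 0; applying T_min as a lower bound for every value of α and β is an assumption, as are the function and parameter names.

```python
from typing import Tuple


def rotation_windows(alpha: float, beta: float, t: float, t_min: float) -> Tuple[float, float]:
    """Return (T_c, T_s) from equations (4) and (5), never below t_min."""
    for name, value in (("alpha", alpha), ("beta", beta)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    t_c = max(alpha * t, t_min)  # rough-stage window, eq. (4) with floor
    t_s = max(beta * t, t_min)   # precise-stage window, eq. (5) with floor
    return t_c, t_s


fast_then_precise = rotation_windows(1.0, 0.0, t=2.0, t_min=0.1)
high_precision = rotation_windows(0.0, 0.0, t=2.0, t_min=0.1)
```

With T = 2.0 s and T_min = 0.1 s, the setting α = 1, β = 0 gives a 2.0 s rough window and a 0.1 s precise window, while α = β = 0 pins both windows to 0.1 s.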

As shown in Figure 3, the FLANN algorithm is used for the first-stage matching of the target image and the current image: the midpoint of the target image and the midpoint of the current image are calculated, and then the difference between the two midpoints. If the horizontal or vertical coordinate of this difference is greater than the threshold R_c, the camera must be adjusted: provided the computing platform is operating normally, it sends a rotation command to the camera through the serial port so that the camera rotates toward the target area. By default the system works at the high-precision setting, i.e. α = 0, β = 0. When the difference falls below the threshold R_c, the computing platform sends a stop command to the camera through the serial port; if the difference never exceeded R_c in the first place, the camera remains stationary.

As shown in Figure 4, after the first stage of matching, the current image is close to the target image but not exactly matched. In the second stage, the matching feature points obtained by FLANN are used to construct an image perspective transformation, giving the precise distances Δx_p and Δy_p between the two images. The rotation method is the same as in the first stage. The camera is finally turned to the target image with an error no greater than the threshold R_s.

As shown in Figures 5, 6 and 7, one hundred tests were carried out under normal weather in each of three settings: a campus indoor environment, a campus outdoor environment and a factory-like indoor environment, with the target image changed after every ten tests. The success rates were 99%, 100% and 100% respectively, which shows that the proposed method can match accurately across a variety of conditions and target images and can greatly improve the working efficiency of monitoring personnel.

The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to across one another.

Specific examples have been used herein to illustrate the principle and implementation of the present invention. The description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea; at the same time, a person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementation and the scope of application. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims

A control method for improving the positioning accuracy of a pan-tilt camera, characterized in that it comprises the steps:

Step 101, initialization: the computing platform and the pan-tilt camera are both powered on; after the self-tests of the computing platform and the pan-tilt camera complete, the computing platform retrieves the network video stream of the pan-tilt camera frame by frame using the ffmpeg library; the computing platform then controls the pan-tilt camera to collect image information of the whole application scene, and constructs a panorama of the application scene using an image stitching method;

Step 102: the display screen of the computing platform displays the panorama of the application scene; the user selects a region of interest in the panorama, and the image of the region of interest is saved as a target image; the user selects one target image from a plurality of target images as the target to rotate to;

Step 103, rough search: the computing platform receives the target image provided by the user, obtains the coordinates of the target image and of the current image using the FLANN matching algorithm, and calculates the coordinate difference between the target image and the current image; if the coordinate difference does not meet the threshold condition, the rotation direction and rotation distance of the pan-tilt camera are determined from the coordinate difference, and the pan-tilt camera is controlled to rotate accordingly; if the coordinate difference meets the threshold condition, no rotation is performed; step 103 is repeated until the coordinate difference between the target image and the current image meets the threshold condition;

Step 104, precise search: the computing platform takes the two images, which are close to each other and whose matching points are close together, and extracts their feature points; using a perspective transformation, the two images are stitched in the same coordinate system to obtain the precise abscissa difference Δx_p and the precise ordinate difference Δy_p; if the abscissa difference and the ordinate difference meet the threshold condition, the computing platform sends a control command through the serial port of the pan-tilt camera, and the pan-tilt camera rotates continuously according to the calculated distance and direction for a fixed time window of length T_s; if the abscissa difference and the ordinate difference do not meet the threshold condition, step 104 is repeated until the distance between the two images is less than the set threshold R_s, at which point image matching ends; the precise position, together with the output latitude and longitude of the location of the target to be identified, is used by the control circuit to rotate the camera to the target area selected by the user.

The control method for improving the positioning accuracy of a pan-tilt camera according to claim 1, characterized in that the pan-tilt camera is a camera that supports a network video protocol, and the computing platform and the pan-tilt camera communicate over a wired or wireless connection.

The control method for improving the positioning accuracy of a pan-tilt camera according to claim 1, characterized in that the computing platform controls the pan-tilt camera to collect the image information of the whole application scene as follows:

The computing platform sends rotation commands through the serial port of the pan-tilt camera, and the camera turns left or right for a fixed time window of length T_q and then stops; for different scenes, the user adjusts the balance factor σ to trade off the speed of panorama generation against its success rate, according to the balance formula:

A = 1 - σ, T_q = σT (1)

where A is the stitching success rate, T is a fixed time period, and T_q is the fixed time window length; when σ = 1, image acquisition is fastest and the success rate is lowest; when σ = 0, image acquisition is slowest and the success rate is highest;

After each rotation stops, the computing platform extracts the real-time video frame and saves it as a picture; the camera continues rotating, and video frames continue to be extracted and saved as pictures, until all of the working scene information has been saved to the computing platform.

The control method for improving the positioning accuracy of a pan-tilt camera according to claim 3, characterized in that the image stitching method is based on the Stitching method in OpenCV, and the computing platform stitches the stored scene images to form the panorama of the application scene.

The control method for improving the positioning accuracy of a pan-tilt camera according to claim 4, characterized in that step 103 specifically comprises the following steps:

Step 103-1: after receiving the target image provided by the user, the computing platform matches the target image against the panoramic image using the FLANN matching algorithm, which returns the coordinates (X, Y) of all feature-matched target points; the coordinates of each pair of adjacent target points are compared, and if the distance between them exceeds the length or width of the original image, the target point is reclassified as a mismatched point;

Step 103-2: traverse all matching points and find the four vertices of the region in which the matching points lie: the point of minimum x and minimum y, the point of minimum x and maximum y, the point of maximum x and minimum y, and the point of maximum x and maximum y;

Step 103-3: obtain the matching midpoint of the current image and the matching midpoint of the target image;

Step 103-4: determine the rotation direction of the pan-tilt camera according to the distance between the current image matching midpoint and the target image matching midpoint in the x-axis direction and the distance in the y-axis direction.

The control method for improving the positioning accuracy of a pan-tilt camera according to claim 5, characterized in that, in the rough search and precise search of step 103 and step 104, the rotation time windows are set as:

T_c = αT (4)

T_s = βT (5)

where α, β ∈ [0, 1] help the user adjust the threshold values flexibly; when α = 1 and β = 0, T_c reaches its maximum value, so each rotation is long, fewer calculations are needed and matching is fast; when α = 0 and β = 0, T_c = T_s = T_min, where T_min is the configured minimum rotation time window.

The control method for improving the positioning accuracy of a pan-tilt camera according to claim 5, characterized in that, in step 103 and step 104, the computing platform controls the pan-tilt camera to rotate according to the obtained rotation direction and rotation distance, and the rotation direction falls into one of the following 8 cases:

When Δx>R, -R≤Δy≤R, the pan-tilt camera rotates to the right at the speed θ_c;

When Δx<-R, -R≤Δy≤R, the pan-tilt camera rotates to the left at the speed θ_c;

When -R≤Δx≤R, Δy>R, the pan-tilt camera rotates upward at the speed θ_c;

When -R≤Δx≤R, Δy<-R, the pan-tilt camera rotates downward at the speed θ_c;

When Δx>R, Δy>R, the pan-tilt camera rotates to the upper right at the speed θ_c;

When Δx>R, Δy≤-R, the pan-tilt camera rotates to the lower right at the speed θ_c;

When Δx≤-R, Δy>R, the pan-tilt camera rotates to the upper left at the speed θ_c;

When Δx≤-R, Δy≤-R, the pan-tilt camera rotates to the lower left at the speed θ_c;

where R > 0 is the preset accuracy threshold R_c or R_s; in the rough search, the search stops when the difference between the camera coordinates and the target coordinates is less than or equal to R_c; in the precise search, the search stops when the difference between the camera coordinates and the target coordinates is less than or equal to R_s.

A system for improving the positioning accuracy of a pan-tilt camera using the control method according to claim 1, characterized in that it comprises:

a pan-tilt camera, used as the rotatable camera;

a battery module and a power module, used to supply power to the computing platform and the pan-tilt camera;

a computing platform, used to communicate with the pan-tilt camera and control its rotation, comprising:

an information receiving device for retrieving the network video stream of the pan-tilt camera frame by frame and receiving the target image provided by the user;

a control device for controlling the pan-tilt camera to collect image information of the whole application scene and for controlling the pan-tilt camera to rotate;

a display device for displaying the application scene and allowing the user to select target images;

a data processing device for extracting image features, stitching images, and obtaining the coordinate difference between the target image and the current image;

a data storage device for saving video frames extracted in real time as pictures.