Embodiment:
The method that the present invention is used to fast and automatically extract rectangular scanning file from digital photograph is suitable for general digital
Camera or mobile phone etc., the image acquired under the conditions of all kinds of natural environments (illumination).Key step includes 1) data it is down-sampled and
Greyscale transformation;2) image filter is made an uproar and edge detection;3) contour tracing detects largest contours line;4) profile simplification and convex closure angle point
Detection;5) region is cut out and perspective distortion is corrected.Fig. 1 provides the key step and process flow of this method:
1) data are down-sampled and greyscale transformation:
It is rich in color since digital photograph (digital picture) data volume is big (sampled point is more, typically larger than four mega pixels)
(usually 24 three colour imagings), need to pre-process data, to promote calculated performance.The present invention is down-sampled, uses
The mode of scaled down, that is, keep the length-width ratio of image.Scaling ruler is typically chosen according to the size of input picture
Between 1/6~1/2 (input picture is bigger, and scale bar s is smaller).The methods of bilinearity, cubic convolution can be used in sampling process;It examines
Consider computational efficiency, also can use nearest point methods (nearest neighbor) and complete down-sampled process, it may be assumed that
Xdst=s Xsrc
Wherein, s is zoom factor,(image space), XsrcIndicate source pixel, XdstIndicate object pixel.
Greyscale transformation is then by image by tri- color space transformation of RGB to brightness space Y:
Y←0.299·R+0.587·G+0.114·B
Wherein, RGB is three color components of image pixel, red (red), green (green), blue (blue).
In view of image capture environment (image-forming condition) is complicated (illumination and reflection environment etc.), to promote the method for the present invention
Extensive adaptability carries out gray scale to image and draws high, with the robustness of method for improving and the accuracy of processing result.Greyscale transformation is adopted
With linear transformation, the actual tonal range of image is mapped to standard grayscale range [0.0,1.0] or [0,255].
Transformation for mula is as follows: Ydst=α * Ysrc+ β,
Wherein, the ÷ of α=255 (max (Ysrc)–min(Ysrc)), β=0- α * min (Ysrc)) (note: if tonal range is
1.0) 255 in formula are then revised as by [0.0,1.0],
YsrcIndicate source pixel gray value, YdstIndicate the gray value of object pixel.
2) image filter is made an uproar and edge detection:
The main purpose that image filter is made an uproar is to improve the accuracy of edge detection.It is flat to filter adoptable all kinds of common images
Sliding algorithm, such as mean filter, median filtering.It is briefly explained by taking 5 × 5 Gaussian kernels (σ=1) as an example below.According to Gaussian kernel letter
Number can calculate convolution mask (integer value):
Convolution is carried out to the image after greyscale transformation with this template, can preferably be retained while filtering out high-frequency noise
Edge feature.
The normal information such as the first derivative based on image, second dervative or gradient of edge detection.The present invention uses Canny operator
Edge is detected, considers that image passes through greyscale transformation (stretching), lower threshold value is respectively set to the 3/8 of tonal range thereon, and 6/8 (to 8
Position gray level image, value 96,192).
3) contour tracing and largest contours line drawing:
The marginal information detected in previous step contains noise, and connectivity is relatively poor (discrete short edge),
Before carrying out contour tracing, refine.Its processing step is first to carry out morphological dilations, connects discontinuous neighbour
Edge fit edge, then carry out etching operation, edge is refined (using 3 × 3 square structure elements in the present invention, it is possible to use its
His structural element).
Frontier tracing eight neighborhood pixel-based, since outermost edge starting point, successively by (side counterclockwise in its neighborhood
To) abutment points be added to tracked marginal point and concentrate, iteration proceeds to until not new marginal point is added.If institute with
The edge length (as unit of pixel) of track be greater than whole image perimeter (that is, 2 times of image it is wide high and) half, then stop searching
Other marginal informations of rope, that is, in this, as the candidate rectangle contour line for tracking (detection).
4) profile is simplified and convex closure angle point calculates:
After extracting initial profile boundary, regularization or simplification are needed, to indicate true profile.Ramer-
Douglas-Peucker algorithm or Douglas-Pecuker algorithm (RDP/DP), are common polyline reduction algorithm, mistake
Journey is with the distance of given point to line (threshold value, this method in be set as the 1~4% of broken line length), and recurrence simplification is apart from threshold
Point (Fig. 2) in value.
Profile angle point set after simplified after calculating its convex closure, selects four points closest to right angle as candidate regions
The rectangle angle point in domain.By taking the scanning method of 2D plane point set as an example, illustrate convex closure calculating process: (right side) point makees starting point under selection most
(that is, y-coordinate smallest point;There is y-coordinate minimum value if there is multiple points, then select the point wherein with maximum x coordinate);It penetrates
Line (direction (1,0) or x-axis positive) rotates counterclockwise around starting point, select on the smallest ray of scan angle point (if there is
It is multiple, then select the point nearest apart from starting point) as next convex closure vertex;Using the point being newly added as rotation center, will scan
Vector replaces the vector between the point and former point;Above step is repeated until returning to starting point.
(note: the time complexity of above-mentioned algorithm is O (n2), more efficient O (n log n) algorithm can also be used such as,
Graham, Chan algorithm etc.)
If above-mentioned convex closure is more than four vertex, the angle point wherein closest to 90 ° is selected, four as region to be transformed
The vertex of the same name of side shape (rectangle).
5) region is cut out and perspective distortion is corrected:
After selecting region vertex, need to calculate the vertex of the same name of corresponding rectangle.It sorts first to region vertex, it is specific to arrange
Program process are as follows: select most lower-left point (or, point with min coordinates and (x+y)) as starting point P0, others point is then by the inverse time
Needle (or clockwise) direction sequencing is P1, P2, P3.Corresponding rectangle apex coordinate be respectively (counter clockwise direction): (0,0), (w,
0), (w, h), (0, h), in which:
W=max (| P0P1|,|P2P3|)
H=max (| P0P3|,|P2P1|)
Wherein, | | indicate line segment length (if up time needle sort, the adjustment of rectangle vertex correspondence).
According to the corresponding relationship on candidate region vertex and rectangle vertex, transformation relation (3 × 3 matrix) can be calculated:
It is denoted as: A3×3X3×4=U3×4, then, and A X XT=U XT, that is, have, A=U XT(X XT)-。
After calculating transformation matrix, selected candidate region can be mapped to corresponding rectangular area, in original image
Perspective distortion, be able in mapping process correct (Fig. 3).(note: including rotation, ratio, translation in above-mentioned transformation matrix,
And mistake such as cuts, has an X-rayed at the transformation)
The fast automatic method for extracting rectangular scanning part in slave general digital photo of the invention, to photo environment, shooting
Condition is not necessarily to manual intervention without particular/special requirement, treatment process, and antinoise (robust), calculating speed is fast, high-efficient, and treated
Digital archive part occupies little space, convenient for preservation and information exchange.Using the present invention as the system of core, it can be used for fast automatic place
The scan image for managing the equipment such as ordinary digital camera photo or scanner, can also be directly mounted at the smart machines such as mobile phone, real
When handle document information.
It is illustrated below using automatically extracting rectangular scanning part from the photo that certain mobile phone is shot as embodiment.Photograph taking
In indoor environment, illumination is insufficient (whole image is obviously partially dark).Image size is 2592 × 1944 pixels, and 24 RGB tri- are colored,
With the storage of jpeg format, occupied space is 1.3 Mbytes.Target area is rectangle paper document, occupies entire imaging region (number
Word photo) major part, relative to image direction (coordinate system), there are rotation and perspective distortions.Data are illustrated such as Fig. 4 institute
Show.
Data are carried out down-sampled rear (scale bar 1/4) using method of the invention, candidate image size is 648 × 496
Pixel is converted to 8 gray level images, wherein gray scale maximum value be 122, minimum value be 7 (brightness is obviously relatively low), need to its into
Row greyscale transformation, to promote the accuracy of edge detection.That is, calculating the coefficient in greyscale transformation formula, Ydst=α * Ysrc+ β,
α=2.21739, β=- 15.52174.Greyscale transformation result is as shown in Figure 5.
Gauss filter is carried out to gray level image to make an uproar, and using Canny operator detection image edge, then carries out utilizing morphological operator
After carrying out refined processing (that is, first expanded with 3 × 3 square structure elements, corrode again) to extracted edge, for extracting equivalence
Line (Fig. 6).
After selecting largest contours line (outermost profile), it is simplified using RDP algorithm, then calculate its convex closure, and selected most
Close to the angle point of four convex closure vertex as candidate region at right angle, respective coordinates are (clockwise): (49,123), (104,
604), (484,543), (393,77);Map that target rectangle region: (0,0), (0,384), (484,384), (484,
0).After (equal proportion amplification) calculates transformation matrix (comprising translation, rotation and perspective etc.), the candidate region application of original image is become
It changes, automatic shearing cuts down and merge mapping (rotation and perspective correction) to target (rectangle) region.Image after processing only includes body paper matter
File region, size are 1936 × 1536 pixels, about the 69% of original image size;Jpeg format (relevant parameter and original image
It is identical) file stores the space occupied about 0.7Mb, than the memory space that original image saves about 45%, and treated image (rectangle
Paper document) horizontal or vertical direction it is more neat, meet the visual custom (Fig. 7) of people.
In addition to the implementation, the present invention can also have other embodiments.It is all to use equivalent substitution or equivalent transformation shape
At technical solution, fall within the scope of protection required by the present invention.