
WO2018176929A1 - Image background blurring method and apparatus - Google Patents

Image background blurring method and apparatus

Info

Publication number
WO2018176929A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
depth
pixel
reference image
pyramid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2017/117180
Other languages
English (en)
Chinese (zh)
Inventor
宋明黎
李欣
黄一宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of WO2018176929A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/62 Control of parameters via user interfaces
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/04 Context-preserving transformations, e.g. by using an importance map
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2621 Cameras specially adapted for the electronic generation of special effects during image pickup, e.g. digital cameras, camcorders, video cameras having integrated special effects capability

Definitions

  • the embodiments of the present application relate to the field of image processing technologies, and in particular, to an image background blurring method and apparatus.
  • the background blur of an image refers to keeping the subject of the image in sharp focus while the non-subject elements are blurred.
  • for example, in a scene containing a mountain and water, if we want the mountain to be the subject of the whole image, we can focus the camera on the mountain; the mountain will then appear clear while the water becomes blurred.
  • an embodiment of the present application provides an image background blurring method and device, so that the mobile terminal can capture an image with a clear foreground and a blurred background.
  • the embodiment of the present application is implemented as follows:
  • an embodiment of the present application provides an image background blurring method, the method comprising: extracting a reference image and m non-reference images from a target video according to an image extraction rule; constructing a first image pyramid using the reference image and m second image pyramids using the m non-reference images; determining a scene depth map of the reference image using the first image pyramid and the m second image pyramids; dividing the pixels of the reference image into n depth layers using the scene depth map; determining a target position in the reference image; determining, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located; and performing blur processing on the pixels to be processed.
  • the target video is a video captured by the mobile terminal along a predetermined trajectory, and the predetermined trajectory may be preset.
  • the predetermined trajectory is a moving trajectory within a single plane; for example, it may run from left to right, from right to left, from top to bottom, or from bottom to top in that plane.
  • the image extraction rule is a preset rule, and the image extraction rule may be: selecting a reference image and m non-reference images in the target video according to the playing duration of the target video, where m is a positive integer greater than or equal to 1.
  • the reference image and the non-reference images are extracted from different moments of the target video; they depict the same shooting scene, but the viewing angle and position of the reference image differ from those of the non-reference images.
  • the mobile terminal uses the reference image as the bottom image of the first image pyramid. The resolution of the bottom image is then halved to obtain the layer above it, and this step is repeated to obtain successively higher layers. After several repetitions, a first image pyramid containing the reference image at different resolutions is obtained.
  • the scene depth map of the reference image represents the relative distance between any pixel point in the reference image and the mobile terminal, and the pixel value of the pixel point in the scene depth map represents the relative distance between the actual location where the pixel point is located and the mobile terminal.
  • the mobile terminal can acquire the preset n and the manner of dividing the depth layer, so that the number of depth layers and the depth range of each depth layer can be known.
  • the target position is determined in the reference image according to the control command.
  • the control instruction may be an instruction input by the user on the touch screen of the mobile terminal by using a finger.
  • the specific position in the reference image is determined as the target position.
  • the specific position in the reference image is a previously specified position.
  • the face image in the reference image is identified, and the position of the face image in the reference image is determined as the target position.
  • the pixel to be processed is a pixel point included in a depth layer other than the target depth layer among the n depth layers.
  • the embodiment of the present application divides the pixels of the reference image into n depth layers using the obtained scene depth map, and uses the determined target position of the reference image to identify, among the n depth layers, the target depth layer in which the pixel at the target position is located.
  • determining the scene depth map of the reference image using the first image pyramid and the m second image pyramids comprises: determining a preliminary depth map of the reference image according to the top image of the first image pyramid and the top images of the m second image pyramids, the first image pyramid and the m second image pyramids each including a top image and lower-layer images; and determining the scene depth map of the reference image according to the preliminary depth map, the lower-layer image of the first image pyramid, and the lower-layer images of the m second image pyramids.
  • depth is recovered from the reference image at different resolutions within the first image pyramid and the m second image pyramids, and the low-resolution preliminary depth map is used to derive the high-resolution scene depth map, which speeds up depth recovery; the embodiment of the present application can therefore generate the depth of the reference image more quickly.
  • determining the preliminary depth map of the reference image according to the top image of the first image pyramid and the top images of the m second image pyramids comprises: calculating a first matching loss body according to the top image of the first image pyramid and the top images of the m second image pyramids; and constructing a Markov random field model from the first matching loss body to perform global matching loss optimization, obtaining the preliminary depth map of the reference image.
  • the first matching loss body may first be calculated according to the top image of the first image pyramid and the top images of the m second image pyramids; the MRF model is then constructed from the first matching loss body to perform global matching loss optimization, thereby obtaining a preliminary depth map of the reference image with smooth detail.
  • calculating the first matching loss body according to the top image of the first image pyramid and the top images of the m second image pyramids includes: obtaining the camera external parameters and camera internal parameters of the mobile terminal at the viewing angles of the reference image and the m non-reference images; determining feature points in the reference image according to a feature point extraction rule; acquiring the three-dimensional coordinates of the feature points of the reference image; determining, according to the three-dimensional coordinates of the feature points, the minimum depth value and the maximum depth value in the scene in which the reference image is located; determining a plurality of depth planes between the minimum depth value and the maximum depth value; using the camera internal parameters, the camera external parameters, and a direct linear transformation algorithm to calculate the first homography matrices that map the plurality of depth planes from the plane in which the reference image lies to the planes in which the m non-reference images lie; using a plane sweep algorithm and the first homography matrices, projecting each pixel of the top image of the first image pyramid, over the plurality of depth planes, onto the planes in which the top images of the m second image pyramids lie, to obtain the projected parameter value of each pixel; determining the matching loss of each pixel at each depth value according to the parameter value of each pixel of the top image of the first image pyramid and the projected parameter value; and taking the matching losses of each pixel over the plurality of depth planes as the first matching loss body.
  • re-projection is used to calculate the matching loss, so that depth recovery can better adapt to the camera pose changes between the reference image and the m non-reference images, improving the reliability of the depth recovery method.
  • determining the plurality of depth planes between the minimum depth value and the maximum depth value comprises: using the camera internal parameters, the camera external parameters, and a direct linear transformation algorithm to calculate the second homography matrices that map the first depth plane, in which the minimum depth value lies, from the reference image plane to the m non-reference image planes; using the camera internal parameters, the camera external parameters, and the direct linear transformation algorithm to calculate the third homography matrices that map the second depth plane, in which the maximum depth value lies, from the reference image plane to the m non-reference image planes; projecting a pixel of the reference image onto the planes of the m non-reference images according to the second homography matrices to obtain a first projection point; projecting the same pixel of the reference image onto the planes of the m non-reference images according to the third homography matrices to obtain a second projection point; uniformly sampling a plurality of points on the line segment between the first projection point and the second projection point; and back-projecting the plurality of sampling points into the three-dimensional space of the viewing angle of the reference image to obtain a plurality of depth planes corresponding to the depth values of the sampling points.
  • when the matching loss of a pixel of the reference image is calculated for a given depth plane, the pixel needs to be re-projected onto the m non-reference image planes; choosing the depth planes so that, after re-projection over the plurality of depth planes, the resulting positions in the m non-reference images are evenly distributed helps the subsequent steps extract the pixel matching information between the reference image and the m non-reference images more effectively, thereby improving the accuracy of the scene depth map.
  • determining the scene depth map of the reference image according to the preliminary depth map, the lower-layer image of the first image pyramid, and the lower-layer images of the m second image pyramids includes: determining the pixels of the lower-layer image of the first image pyramid that correspond to the pixels of the top image of the first image pyramid; determining the pixels of the lower-layer images of the m second image pyramids that correspond to the pixels of the top images of the m second image pyramids; determining estimated depth values of the pixels of the lower-layer image of the first image pyramid according to the preliminary depth map; determining, according to the estimated depth values, the minimum depth value and the maximum depth value of the pixels of the lower-layer image of the first image pyramid; determining a plurality of depth planes of the lower-layer image of the first image pyramid between the minimum depth value and the maximum depth value; calculating, using the plane sweep algorithm and the plurality of depth planes, a second matching loss body corresponding to the lower-layer image of the first image pyramid and the lower-layer images of the m second image pyramids; and, using the lower-layer image of the first image pyramid as the guide image, locally optimizing the second matching loss body by guided filtering to obtain the scene depth map of the reference image.
  • the preliminary depth map is used to estimate the minimum depth value and the maximum depth value of the pixels of the lower-layer image of the first image pyramid, thereby determining a relatively small depth search interval; this reduces the amount of calculation and improves the robustness of the depth recovery method against image noise interference.
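  • A minimal sketch of the guided local optimization described above, assuming the matching loss body is stored as a (depth × height × width) array and filtered slice by slice with OpenCV's guided filter (which requires the opencv-contrib package); the filter radius and eps are illustrative assumptions.
```python
import numpy as np
import cv2

def refine_cost_volume(cost_volume, guide_img, radius=8, eps=1e-3):
    """cost_volume: (D, H, W) matching losses; guide_img: (H, W, 3) lower-layer reference image.
    Returns the per-pixel index of the depth plane with minimal filtered loss."""
    guide = guide_img.astype(np.float32) / 255.0
    refined = np.empty_like(cost_volume, dtype=np.float32)
    for d in range(cost_volume.shape[0]):
        # Smooth each depth slice while preserving the edges of the guide image.
        refined[d] = cv2.ximgproc.guidedFilter(guide, cost_volume[d].astype(np.float32), radius, eps)
    # Winner-take-all over the filtered volume gives the per-pixel depth index.
    return refined.argmin(axis=0)
```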
  • determining, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located includes: acquiring the specified pixel at the target position of the reference image; determining the pixel value corresponding to the specified pixel in the scene depth map; and determining, among the n depth layers according to the pixel value corresponding to the specified pixel, the target depth layer in which the specified pixel is located.
  • after the mobile terminal determines the target position in the reference image, it can directly obtain the specified pixel at the target position, determine the pixel value corresponding to that pixel in the scene depth map, and then identify the depth layer corresponding to that pixel value; in this way, the target depth layer in which the pixel corresponding to the target position is located can be determined among the n depth layers.
  • performing blur processing on the pixels to be processed includes: determining the L depth layers in which the pixels to be processed are located, where L is greater than or equal to 2 and less than n; calculating the depth differences between the L depth layers and the target depth layer; and applying a predetermined-ratio blur to the pixels of each of the L depth layers according to the depth difference, where the degree of blur of the pixels of each of the L depth layers is proportional to the depth difference.
  • the depth differences between the L depth layers and the target depth layer can be calculated, and the mobile terminal can then apply a preset-ratio blur to the pixels of each of the L depth layers according to the depth difference.
  • the degree of blur of the pixels of each of the L depth layers is proportional to the depth difference: the larger the depth difference between a depth layer and the target depth layer, the greater the degree of blurring of the pixels in that depth layer; the smaller the depth difference, the smaller the degree of blurring. This reflects a sense of depth at different distances in the reference image.
  • an embodiment of the present application provides an image background blurring apparatus, the apparatus including: an extracting module, configured to extract a reference image and m non-reference images from a target video according to an image extraction rule, where the target video is a video captured by the mobile terminal along a predetermined trajectory and m is greater than or equal to 1;
  • a building module configured to construct a first image pyramid by using a reference image, and construct m second image pyramids by using m non-reference images;
  • a first determining module configured to determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents a relative distance between any pixel point in the reference image and the mobile terminal;
  • a dividing module configured to divide the pixels of the reference image into n depth layers using the scene depth map, where objects corresponding to pixels in different depth layers are at different depths from the mobile terminal and n is greater than or equal to 2;
  • a second determining module configured to determine a target location in the reference image
  • a third determining module configured to determine, from the n depth layers, a target depth layer where the pixel corresponding to the target location is located;
  • a blur processing module configured to perform blur processing on the pixels to be processed, where the pixels to be processed are the pixels contained in the depth layers other than the target depth layer among the n depth layers.
  • the embodiment of the present application divides the pixels of the reference image into n depth layers using the obtained scene depth map, and uses the determined target position of the reference image to identify, among the n depth layers, the target depth layer in which the pixel at the target position is located.
  • the first determining module is specifically configured to determine a preliminary depth map of the reference image according to the top image of the first image pyramid and the top images of the m second image pyramids, the first image pyramid and the m second image pyramids each including a top image and lower-layer images, and to determine the scene depth map of the reference image according to the preliminary depth map, the lower-layer image of the first image pyramid, and the lower-layer images of the m second image pyramids.
  • depth is recovered from the reference image at different resolutions within the first image pyramid and the m second image pyramids, and the low-resolution preliminary depth map is used to derive the high-resolution scene depth map, which speeds up depth recovery; the embodiment of the present application can therefore generate the depth of the reference image more quickly.
  • the first determining module is specifically configured to calculate a first matching loss body according to the top image of the first image pyramid and the top images of the m second image pyramids, and to construct a Markov random field model from the first matching loss body to perform global matching loss optimization, obtaining a preliminary depth map of the reference image.
  • the first matching loss body may first be calculated according to the top image of the first image pyramid and the top images of the m second image pyramids; the MRF model is then constructed from the first matching loss body to perform global matching loss optimization, thereby obtaining a preliminary depth map of the reference image with smooth detail.
  • the first determining module is specifically configured to acquire the camera external parameters and camera internal parameters of the mobile terminal at the viewing angles of the reference image and the m non-reference images; determine feature points in the reference image according to the feature point extraction rule; acquire the three-dimensional coordinates of the feature points of the reference image; determine, according to the three-dimensional coordinates of the feature points, the minimum depth value and the maximum depth value in the scene in which the reference image is located; determine a plurality of depth planes between the minimum depth value and the maximum depth value; use the camera internal parameters, the camera external parameters, and a direct linear transformation algorithm to calculate the first homography matrices that map the plurality of depth planes from the plane in which the reference image lies to the planes in which the m non-reference images lie; use a plane sweep algorithm and the first homography matrices to project each pixel of the top image of the first image pyramid, over the plurality of depth planes, onto the planes in which the top images of the m second image pyramids lie, obtaining the projected parameter value of each pixel; and determine the matching loss of each pixel at each depth value according to the parameter value of each pixel of the top image of the first image pyramid and the projected parameter value of each pixel;
  • re-projection is used to calculate the matching loss, so that depth recovery can better adapt to the camera pose changes between the reference image and the m non-reference images, improving the reliability of the depth recovery method.
  • the first determining module is specifically configured to calculate, using the camera internal parameters, the camera external parameters, and a direct linear transformation algorithm, the second homography matrices that map the first depth plane, in which the minimum depth value lies, from the reference image plane to the m non-reference image planes; calculate, using the camera internal parameters, the camera external parameters, and the direct linear transformation algorithm, the third homography matrices that map the second depth plane, in which the maximum depth value lies, from the reference image plane to the m non-reference image planes; project a pixel of the reference image onto the planes of the m non-reference images according to the second homography matrices to obtain a first projection point; project the same pixel of the reference image onto the planes of the m non-reference images according to the third homography matrices to obtain a second projection point; uniformly sample the line segment between the first projection point and the second projection point to obtain a plurality of sampling points; and back-project the plurality of sampling points into the three-dimensional space of the viewing angle of the reference image to obtain a plurality of depth planes corresponding to the depth values of the sampling points.
  • when the matching loss of a pixel of the reference image is calculated for a given depth plane, the pixel needs to be re-projected onto the m non-reference image planes; choosing the depth planes so that, after re-projection over the plurality of depth planes, the resulting positions in the m non-reference images are evenly distributed helps the subsequent steps extract the pixel matching information between the reference image and the m non-reference images more effectively, thereby improving the accuracy of the scene depth map.
  • the first determining module is specifically configured to determine the pixels of the lower-layer image of the first image pyramid that correspond to the pixels of the top image of the first image pyramid; determine the pixels of the lower-layer images of the m second image pyramids that correspond to the pixels of the top images of the m second image pyramids; determine estimated depth values of the pixels of the lower-layer image of the first image pyramid according to the preliminary depth map; determine, according to the estimated depth values, the minimum depth value and the maximum depth value of the pixels of the lower-layer image of the first image pyramid; determine a plurality of depth planes of the lower-layer image of the first image pyramid between the minimum depth value and the maximum depth value; calculate, using the plane sweep algorithm and the plurality of depth planes, a second matching loss body corresponding to the lower-layer image of the first image pyramid and the lower-layer images of the m second image pyramids; and, using the lower-layer image of the first image pyramid as the guide image, locally optimize the second matching loss body by guided filtering to obtain the scene depth map of the reference image.
  • the preliminary depth map is used to estimate the minimum depth value and the maximum depth value of the pixels of the lower-layer image of the first image pyramid, thereby determining a relatively small depth search interval; this reduces the amount of calculation and improves the robustness of the depth recovery method against image noise interference.
  • the third determining module is specifically configured to acquire the specified pixel at the target position of the reference image; determine the pixel value corresponding to the specified pixel in the scene depth map; and determine, among the n depth layers according to the pixel value corresponding to the specified pixel, the target depth layer in which the specified pixel is located.
  • after the mobile terminal determines the target position in the reference image, it can directly obtain the specified pixel at the target position, determine the pixel value corresponding to that pixel in the scene depth map, and then identify the depth layer corresponding to that pixel value; in this way, the target depth layer in which the pixel corresponding to the target position is located can be determined among the n depth layers.
  • the blur processing module is specifically configured to determine the L depth layers in which the pixels to be processed are located, where L is greater than or equal to 2 and less than n; calculate the depth differences between the L depth layers and the target depth layer; and apply a predetermined-ratio blur to the pixels of each of the L depth layers according to the depth difference, where the degree of blur of the pixels of each of the L depth layers is proportional to the depth difference.
  • the depth differences between the L depth layers and the target depth layer can be calculated, and the mobile terminal can then apply a preset-ratio blur to the pixels of each of the L depth layers according to the depth difference.
  • the degree of blur of the pixels of each of the L depth layers is proportional to the depth difference: the larger the depth difference between a depth layer and the target depth layer, the greater the degree of blurring of the pixels in that depth layer; the smaller the depth difference, the smaller the degree of blurring. This reflects a sense of depth at different distances in the reference image.
  • an embodiment of the present application provides an image background blurring apparatus, the apparatus including a processor and a memory, where the memory stores operation instructions executable by the processor, and the processor reads the operation instructions in the memory to perform the following operations:
  • extracting a reference image and m non-reference images from the target video according to an image extraction rule, the target video being a video captured by the mobile terminal along a predetermined trajectory, where m is greater than or equal to 1; constructing a first image pyramid using the reference image and m second image pyramids using the m non-reference images; determining a scene depth map of the reference image using the first image pyramid and the m second image pyramids, the scene depth map of the reference image representing the relative distance between any pixel in the reference image and the mobile terminal; dividing the pixels of the reference image into n depth layers using the scene depth map, where objects corresponding to pixels in different depth layers are at different depths from the mobile terminal and n is greater than or equal to 2; determining a target position in the reference image; determining, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located; and performing blur processing on the pixels to be processed.
  • the embodiment of the present application divides the pixels of the reference image into n depth layers using the obtained scene depth map, and uses the determined target position of the reference image to identify, among the n depth layers, the target depth layer in which the pixel at the target position is located.
  • FIG. 1 is a flowchart of an image background blurring method provided by an embodiment of the present application
  • FIG. 2 is a flowchart of another image background blurring method provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • FIG. 5 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an image background blurring apparatus provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of still another image background blurring device provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram showing a design structure of an image background blurring device provided by an embodiment of the present application.
  • FIG. 1 is a flowchart of an image background blurring method provided by an embodiment of the present application.
  • the image background blurring method shown in FIG. 1 can cause the mobile terminal to capture an image with a clear foreground and a blurred background.
  • the method includes the following steps.
  • Step S11 Extracting a reference image and m non-reference images in the target video according to an image extraction rule, where the target video is a video captured by the mobile terminal according to a predetermined trajectory, where m is greater than or equal to 1.
  • the method provided by the embodiment of the present application can be applied to a mobile terminal, and the mobile terminal can be a device such as a smart phone.
  • the target video is a video captured by the mobile terminal according to a predetermined trajectory
  • the predetermined trajectory may be preset
  • the predetermined trajectory is a moving trajectory on the same plane.
  • for example, the predetermined trajectory may be a left-to-right moving trajectory in that plane, a right-to-left moving trajectory, a top-to-bottom moving trajectory, or a bottom-to-top moving trajectory in the same plane.
  • the camera of the mobile terminal needs to remain aimed at the position to be photographed throughout the capture.
  • when the target video is captured, the user needs to move the mobile terminal slowly and smoothly in a single direction, and the moving distance may be 20 cm to 30 cm. While the user holds and moves the mobile terminal, the terminal can judge the moving distance using its gyroscope and select an appropriate reference image and non-reference images from the target video.
  • the image extraction rule is a preset rule, and the image extraction rule may be: selecting a reference image and m non-reference images from the target video according to the playing duration of the target video, where m is a positive integer greater than or equal to 1. For example, if the length of the target video is 20 seconds, the image extraction rule may be to select 1 reference image and 4 non-reference images from the target video, determining the frame at the 10th second of the target video as the reference image and the frames at the 1st, 3rd, 18th, and 20th seconds as the non-reference images.
  • the embodiment of the present application does not limit the number of non-reference images.
  • for example, the number of non-reference images may be three, four, or five.
  • the reference image and the non-reference image are images extracted from different moments in the target video, and the reference image is the same as the shooting scene of the non-reference image, but the angle of view of the reference image is different from the position of the non-reference image.
  • for example, the user captures a 10-second target video with the mobile terminal, the shooting scene of the target video contains Plant A and Plant B, and the image extraction rule is set in advance to extract the frame at the 5th second of the target video as the reference image.
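  • As an informal illustration of such an extraction rule, the sketch below selects frames at fixed timestamps of the target video with OpenCV; the timestamps mirror the 20-second example above, and the function name and values are illustrative assumptions rather than part of the patent.
```python
import cv2

def extract_frames(video_path, reference_t=10.0, non_reference_ts=(1.0, 3.0, 18.0, 20.0)):
    cap = cv2.VideoCapture(video_path)
    def frame_at(t):
        cap.set(cv2.CAP_PROP_POS_MSEC, t * 1000.0)  # seek by playing time (seconds -> ms)
        ok, frame = cap.read()
        return frame if ok else None
    reference = frame_at(reference_t)              # reference image
    non_references = [frame_at(t) for t in non_reference_ts]  # m non-reference images
    cap.release()
    return reference, non_references
```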
  • Step S12 constructing a first image pyramid by using a reference image, and constructing m second image pyramids by using m non-reference images.
  • a first image pyramid can be constructed by using one reference image, and m second image pyramids are constructed by using m non-reference images.
  • the "first" and “second” in the first image pyramid and the second image pyramid are only used to distinguish image pyramids constructed from different images, the first image pyramid represents only the image pyramid constructed by the reference image, and the second image pyramid Represents only image pyramids constructed from non-reference images.
  • the mobile terminal uses the reference image as the bottom image of the first image pyramid. The resolution of the bottom image is then halved to obtain the layer above it, and this step is repeated to obtain successively higher layers. After several repetitions, a first image pyramid containing the reference image at different resolutions is obtained.
  • for example, the mobile terminal uses the reference image as the third-layer image of the first image pyramid, so the resolution of the third-layer image is 1000 × 1000; the resolution of the third-layer image is then halved to obtain the second-layer image of the first image pyramid, whose resolution is 500 × 500; finally, the resolution of the second-layer image is halved again to obtain the first-layer image of the first image pyramid, whose resolution is 250 × 250.
  • the first image pyramid thus includes three layers of images, all of which are the reference image at different resolutions: the first-layer image is the reference image at a resolution of 250 × 250, the second-layer image is the reference image at a resolution of 500 × 500, and the third-layer image is the reference image at a resolution of 1000 × 1000.
  • the construction process of the second image pyramid is the same as that of the first image pyramid, and the second image pyramid has the same number of layers as the first image pyramid; the number of layers of the first and second image pyramids may be set according to actual conditions.
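  • The following minimal sketch shows the pyramid construction described above, with the reference image as the bottom layer and each upper layer at half the resolution; the fixed choice of three layers is an assumption matching the 1000 × 1000 example and not mandated here.
```python
import cv2

def build_image_pyramid(image, num_layers=3):
    pyramid = [image]                             # bottom layer: full-resolution image
    for _ in range(num_layers - 1):
        pyramid.append(cv2.pyrDown(pyramid[-1]))  # each upper layer halves width and height
    # pyramid[0] is the bottom (highest resolution), pyramid[-1] is the top-layer image.
    return pyramid

# first_pyramid   = build_image_pyramid(reference_image)
# second_pyramids = [build_image_pyramid(img) for img in non_reference_images]
```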
  • Step S13 Determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids.
  • the scene depth map of the reference image may be determined by using the first image pyramid and the m second image pyramids.
  • the scene depth map of the reference image represents the relative distance between any pixel point in the reference image and the mobile terminal
  • the pixel value of the pixel point in the scene depth map represents the relative distance between the actual location where the pixel point is located and the mobile terminal.
  • a brief example follows. Assuming the resolution of the reference image is 100 × 100, the reference image has 10,000 pixels; after the scene depth map of the reference image is determined using the first image pyramid and the m second image pyramids, the pixel values of the 10,000 pixels in the scene depth map represent the relative distances between the actual positions of those pixels and the mobile terminal.
  • Step S14 dividing the pixel points of the reference image into n depth layers by using the scene depth map.
  • objects corresponding to pixels in different depth layers are at different depths from the mobile terminal, where n is greater than or equal to 2, and each depth layer has a depth range; for example, the depth range of a certain depth layer may be 10 meters to 20 meters.
  • the n depth layers constitute the scene depth of the reference image, and the scene depth is the distance between the mobile terminal and the position of the farthest pixel point in the reference image.
  • the scene depth may be 0 to 30 meters.
  • the mobile terminal can acquire the preset n and the manner of dividing the depth layers, so that the number of depth layers and the depth range of each depth layer are known. After the scene depth map of the reference image is obtained, the pixel values of the pixels in the scene depth map can be determined. Since the pixel value of a pixel in the scene depth map indicates the relative distance between the actual position of that pixel and the mobile terminal, the mobile terminal can divide the pixels of the reference image into the n depth layers according to the pixel values of the scene depth map.
  • for example, the mobile terminal divides the scene depth of the reference image into three depth layers according to a preset rule: the first depth layer ranges from 0 meters to 10 meters, the second depth layer ranges from 10 meters to 20 meters, and the third depth layer ranges from 20 meters to 30 meters.
  • if the actual position of pixel A in the reference image is 15 meters from the mobile terminal, pixel A is divided into the second depth layer; if the actual position of pixel B in the reference image is 25 meters from the mobile terminal, pixel B is divided into the third depth layer; and if the actual position of pixel C in the reference image is 5 meters from the mobile terminal, pixel C is divided into the first depth layer.
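  • A minimal sketch of the depth-layer division is shown below: the scene depth map is bucketed into n layers with NumPy, using the three 10-meter layers of the example above as illustrative bin edges.
```python
import numpy as np

def divide_into_depth_layers(scene_depth_map, layer_edges=(10.0, 20.0)):
    """scene_depth_map: (H, W) relative distances in meters.
    Returns an (H, W) array of layer indices: 0 -> [0, 10), 1 -> [10, 20), 2 -> [20, 30]."""
    return np.digitize(scene_depth_map, bins=np.asarray(layer_edges))

# depth_layers = divide_into_depth_layers(scene_depth_map)
# A pixel 15 m away gets index 1 (second layer), 25 m -> index 2, 5 m -> index 0.
```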
  • Step S15 determining a target position in the reference image.
  • the target position is determined in the reference image according to the control command.
  • the control instruction may be an instruction input by the user on the touch screen of the mobile terminal by using a finger. For example, the user clicks a certain position in the reference image displayed on the touch screen of the mobile terminal by using the finger, and the mobile terminal determines the location clicked by the user as the target position.
  • the specific position in the reference image is determined as the target position.
  • the specific position in the reference image is a previously specified position. For example, by determining the center point of the reference image as a specific position in advance, the mobile terminal can determine the center point of the reference image as the target position. For another example, the location closest to the mobile terminal in the reference image is determined as a specific location in advance, and then the mobile terminal can determine the location closest to the mobile terminal in the reference image as the target location.
  • the face image in the reference image is identified, and the position of the face image in the reference image is determined as the target position. Since the face image is not necessarily at a fixed position in the reference image, the mobile terminal needs to first recognize the face image in the reference image; after the face image is recognized, its position is determined and that position is taken as the target position.
  • Step S16 Determine, from the n depth layers, a target depth layer where the pixel point corresponding to the target position is located.
  • determining, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located may include the following steps: first, acquiring the specified pixel at the target position of the reference image; second, determining the pixel value corresponding to the specified pixel in the scene depth map; third, determining, among the n depth layers according to the pixel value corresponding to the specified pixel, the target depth layer in which the specified pixel is located.
  • after the mobile terminal determines the target position in the reference image, it can directly obtain the specified pixel at the target position, determine the pixel value corresponding to that pixel in the scene depth map, and then identify the depth layer corresponding to that pixel value; in this way, the target depth layer in which the pixel corresponding to the target position is located can be determined among the n depth layers.
  • for example, suppose the mobile terminal divides the scene depth of the reference image into three depth layers according to a preset rule: the first depth layer ranges from 0 meters to 10 meters, the second from 10 meters to 20 meters, and the third from 20 meters to 30 meters.
  • if the pixel value of specified pixel A corresponds to 15 meters, the target depth layer is the second depth layer, because 15 meters falls within the 10-to-20-meter depth range of the second depth layer; the target depth layer in which pixel A is located is therefore the second depth layer.
  • the pixel corresponding to the target depth layer may be a pixel of one object, and the pixel corresponding to the target depth layer may also be a pixel of multiple objects.
  • the object formed by the pixel points corresponding to the target depth layer is only one flower.
  • the object formed by the pixel corresponding to the target depth layer includes a flower and a tree.
  • the object formed by the pixel corresponding to the target depth layer is a part of a tree.
  • the object formed by the pixel corresponding to the target depth layer includes a part of a flower and a part of a tree.
  • Step S17 Perform blur processing on the pixel to be processed.
  • the pixel to be processed is a pixel point included in a depth layer other than the target depth layer among the n depth layers.
  • after the mobile terminal determines, from the n depth layers, the target depth layer in which the pixel corresponding to the target position is located, it is known that the pixels in the target depth layer need to remain sharp, while the pixels contained in the depth layers other than the target depth layer among the n depth layers need to be blurred; the pixels to be processed are exactly those pixels, so blur processing is performed on them.
  • the reference image becomes an image in which the pixel points of the target depth layer are clear and the pixels to be processed are blurred.
  • the pixel to be processed can be blurred by a Gaussian blur algorithm.
  • other blur algorithms can also be used for the processing.
  • for example, suppose the mobile terminal divides the scene depth of the reference image into three depth layers according to a preset rule: the first depth layer ranges from 0 meters to 10 meters, the second from 10 meters to 20 meters, and the third from 20 meters to 30 meters.
  • if the target depth layer corresponding to a pixel value of 15 meters is the second depth layer, the pixels to be processed contained in the first depth layer and the third depth layer need to be blurred, while the pixels in the second depth layer are kept sharp.
  • after processing, the reference image becomes an image in which the pixels of the second depth layer are sharp and the pixels to be processed in the first and third depth layers are blurred.
  • in order to give the pixels to be processed different degrees of blurring, thereby conveying a sense of depth at different distances in the reference image, step S17 may further include the following steps: first, determining the L depth layers in which the pixels to be processed are located, where L is greater than or equal to 2 and less than n; second, calculating the depth differences between the L depth layers and the target depth layer; third, applying a predetermined-ratio blur to the pixels of each of the L depth layers according to the depth difference, where the degree of blur of the pixels of each of the L depth layers is proportional to the depth difference.
  • since the pixels to be processed are distributed over different depth layers, it is necessary to determine the L depth layers in which they are located and then calculate the depth differences between the L depth layers and the target depth layer.
  • the depth difference is the distance between two depth layers. For example, if the first depth layer ranges from 0 meters to 10 meters, the second from 10 meters to 20 meters, and the third from 20 meters to 30 meters, then the depth difference between the first and second depth layers is 10 meters, and the depth difference between the first and third depth layers is 20 meters.
  • the pixels of each of the L depth layers may then be blurred by a predetermined ratio according to the depth difference. For example, suppose the first depth layer is the target depth layer and the second and third depth layers are the two depth layers in which the pixels to be processed are located; the depth difference between the first and second depth layers is 10 meters, and the depth difference between the first and third depth layers is 20 meters, so the pixels of the second depth layer are blurred by 25% and the pixels of the third depth layer are blurred by 50%.
  • the depth differences between the L depth layers and the target depth layer can be calculated, and the mobile terminal can then apply a preset-ratio blur to the pixels of each of the L depth layers according to the depth difference.
  • the degree of blur of the pixels of each of the L depth layers is proportional to the depth difference: the larger the depth difference between a depth layer and the target depth layer, the greater the degree of blurring of the pixels in that depth layer; the smaller the depth difference, the smaller the degree of blurring. This reflects a sense of depth at different distances in the reference image.
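  • The sketch below illustrates steps S16 and S17 together under simplified assumptions: the depth layer at the tapped target position is looked up, and every other layer is Gaussian-blurred with a kernel that grows with its depth difference from the target layer. The layer centres and kernel scaling are illustrative choices, not values from the patent.
```python
import numpy as np
import cv2

def blur_background(reference, depth_layers, target_xy, layer_centres=(5.0, 15.0, 25.0)):
    """reference: (H, W, 3) image; depth_layers: (H, W) layer indices; target_xy: (x, y) tap position."""
    x, y = target_xy
    target_layer = int(depth_layers[y, x])            # depth layer of the specified pixel
    result = reference.copy()
    for layer in np.unique(depth_layers):
        if layer == target_layer:
            continue                                   # keep the subject layer sharp
        depth_diff = abs(layer_centres[layer] - layer_centres[target_layer])
        k = 2 * int(depth_diff) + 1                    # odd kernel size grows with depth difference
        blurred = cv2.GaussianBlur(reference, (k, k), 0)
        mask = depth_layers == layer
        result[mask] = blurred[mask]                   # replace this layer's pixels with blurred ones
    return result
```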
  • in summary, the embodiment of the present application divides the pixels of the reference image into n depth layers using the obtained scene depth map, and then uses the determined target position of the reference image to identify, among the n depth layers, the target depth layer in which the pixel at the target position is located. The pixels to be processed, contained in the depth layers other than the target depth layer among the n depth layers, can then be blurred to obtain an image in which the pixels of the target depth layer are sharp and the pixels to be processed are blurred. The embodiment of the present application therefore enables the mobile terminal to capture an image with a clear foreground and a blurred background.
  • FIG. 2 is a flowchart of another image background blurring method provided by an embodiment of the present application.
  • the embodiment shown in FIG. 2 is a refinement of step S13 in FIG. 1, so for content that is the same as in FIG. 1, refer to the embodiment shown in FIG. 1.
  • the method shown in Figure 2 includes the following steps.
  • Step S21 Determine a preliminary depth map of the reference image according to the top image of the first image pyramid and the top image of the m second image pyramids, the first image pyramid and the m second image pyramids each including a top image and a lower layer image.
  • in the first image pyramid, the first-layer image is referred to as the top-layer image, the second-layer image through the last-layer image are collectively referred to as lower-layer images, and the last-layer image is referred to as the bottom-layer image.
  • similarly, in the second image pyramid, the first-layer image is referred to as the top-layer image, the second-layer image through the last-layer image are collectively referred to as lower-layer images, and the last-layer image is referred to as the bottom-layer image.
  • Step S22 Determine a scene depth map of the reference image according to the preliminary depth map, the lower layer image of the first image pyramid, and the lower layer image of the m second image pyramids.
  • in the embodiment of the present application, depth is recovered from the reference image at different resolutions in the first image pyramid and the m second image pyramids, and the low-resolution preliminary depth map is used to derive the high-resolution scene depth map, which speeds up depth recovery; the embodiment of the present application can therefore use the image pyramids to generate the scene depth map of the reference image more quickly.
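  • A minimal sketch of the coarse-to-fine idea, assuming the preliminary depth map is simply upsampled to the finer pyramid layer and expanded by a fixed margin to bound each pixel's depth search range (the ±20% margin is an illustrative assumption, not a value from the patent):
```python
import cv2

def depth_search_range(preliminary_depth, fine_shape, margin=0.2):
    """preliminary_depth: low-resolution depth map; fine_shape: (H, W) of the lower-layer image."""
    h, w = fine_shape
    estimated = cv2.resize(preliminary_depth, (w, h), interpolation=cv2.INTER_LINEAR)
    min_depth = estimated * (1.0 - margin)   # per-pixel minimum depth value
    max_depth = estimated * (1.0 + margin)   # per-pixel maximum depth value
    return min_depth, max_depth
```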
  • FIG. 3 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • the embodiment shown in FIG. 3 is based on the refined embodiment of step S21 in FIG. 2, so the same content as FIG. 2 can be seen in the embodiment shown in FIG. 2.
  • the method shown in Figure 3 includes the following steps.
  • Step S31 Calculate a first matching loss body according to the top image of the first image pyramid and the top image of the m second image pyramids.
  • Step S32 Constructing an MRF (Markov Random Field) model according to the first matching loss body to perform global matching loss optimization and obtain a preliminary depth map of the reference image.
  • in the embodiment of the present application, the first matching loss body may first be calculated according to the top image of the first image pyramid and the top images of the m second image pyramids; the MRF model is then constructed from the first matching loss body and the global matching loss is optimized, so that a preliminary depth map of the reference image with smooth detail can be obtained.
  • FIG. 4 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • the embodiment shown in FIG. 4 is a refinement of step S31 in FIG. 3, so for content that is the same as in FIG. 3, refer to the embodiment shown in FIG. 3.
  • the method shown in Figure 4 includes the following steps.
  • Step S41 Acquire a camera external parameter and a camera internal parameter of the mobile terminal in the perspective of the reference image and the m non-reference images.
  • the mobile terminal can use the coordinates of the feature points of the reference image and the non-reference images, the correspondences between those feature points, and an SFM (Structure from Motion) algorithm to calculate the camera external parameters of the mobile terminal at the viewing angles corresponding to the reference image and the non-reference images; the camera external parameters of the mobile terminal include the camera optical center coordinates and the camera optical axis orientation.
  • the camera internal parameters are obtained by calibrating the camera in advance. For example, the mobile terminal can determine the camera internal parameters with a camera calibration toolbox using a checkerboard pattern.
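  • A minimal sketch of such checkerboard-based calibration using OpenCV's standard calibration functions; the board size and square size are illustrative assumptions.
```python
import numpy as np
import cv2

def calibrate_intrinsics(checkerboard_images, board_size=(9, 6), square_size=0.025):
    # 3D coordinates of the checkerboard corners in the board's own plane (z = 0).
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_size
    obj_points, img_points = [], []
    for img in checkerboard_images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)
    # K is the 3x3 camera internal parameter matrix; dist holds distortion coefficients.
    _, K, dist, _, _ = cv2.calibrateCamera(obj_points, img_points, gray.shape[::-1], None, None)
    return K, dist
```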
  • Step S42 Determine feature points in the reference image according to the feature point extraction rule.
  • Step S43 Obtain three-dimensional coordinates of feature points of the reference image.
  • the mobile terminal can perform feature point tracking on the target video using the KLT (Kanade-Lucas-Tomasi Feature Tracker) algorithm to obtain several feature points of the reference image and the three-dimensional coordinates of those feature points.
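  • A minimal sketch of KLT-style feature tracking with OpenCV between the reference frame and another frame of the target video; recovering the three-dimensional coordinates of the tracked feature points additionally requires triangulation with the camera parameters, which is omitted here.
```python
import numpy as np
import cv2

def klt_track(ref_gray, next_gray, max_corners=500):
    # Detect corner feature points in the reference image.
    pts = cv2.goodFeaturesToTrack(ref_gray, maxCorners=max_corners, qualityLevel=0.01, minDistance=7)
    # Track them into the next frame with the pyramidal Lucas-Kanade tracker.
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(ref_gray, next_gray, pts, None)
    good = status.ravel() == 1
    return pts[good].reshape(-1, 2), next_pts[good].reshape(-1, 2)
```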
  • Step S44 Determine a minimum depth value and a maximum depth value in the scene where the reference image is located according to the three-dimensional coordinates of the feature points of the reference image.
  • the minimum depth value and the maximum depth value of the feature points in the reference image may first be determined according to the three-dimensional coordinates; then, the depth range formed by the minimum and maximum depth values of the feature points is expanded by a preset value to obtain the minimum depth value and the maximum depth value in the scene in which the reference image is located.
  • the preset value can be a predetermined empirical value.
  • Step S45 Sample a plurality of depth planes between the minimum depth value and the maximum depth value.
  • the number of depth planes to be sampled and the manner in which they are sampled may be preset; for example, 11 depth planes may be uniformly sampled between the minimum depth value and the maximum depth value.
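  • A small sketch of steps S44-S45, assuming the feature-point depths are available; the margin ratio and the count of 11 planes are used purely as example values:

```python
import numpy as np

def sample_depth_planes(feature_depths, margin_ratio=0.1, num_planes=11):
    """Expand the feature-point depth range by a preset margin, then sample
    depth planes uniformly inside the expanded range."""
    d_min, d_max = float(np.min(feature_depths)), float(np.max(feature_depths))
    span = d_max - d_min
    d_min = max(d_min - margin_ratio * span, 1e-6)   # keep depths positive
    d_max = d_max + margin_ratio * span
    return np.linspace(d_min, d_max, num_planes)
```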
  • Step S46 Calculate, by using the camera internal parameters, the camera external parameters, and a direct linear transformation algorithm, the first homography matrices that map the plurality of depth planes from the plane where the reference image is located to the planes where the m non-reference images are located.
  • the number of first homography matrices depends on the number of depth planes and the number of non-reference images involved in the calculation, so a plurality of first homography matrices are obtained here.
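  • The text computes these matrices with a direct linear transformation; for intuition, the standard closed form of the homography induced by a fronto-parallel plane at depth d can also be written down directly, as in this hedged numpy sketch (the pose convention X_non = R·X_ref + t is an assumption of the sketch, not stated in the text):

```python
import numpy as np

def plane_homography(K_ref, K_non, R, t, depth):
    """Homography mapping reference-image pixels to a non-reference image,
    induced by the plane Z = depth in the reference camera frame
    (X_non = R @ X_ref + t).  The text obtains such matrices via a DLT
    algorithm; this closed form is shown only for illustration."""
    n = np.array([[0.0, 0.0, 1.0]])                          # plane normal as a row vector
    H = K_non @ (R + (t.reshape(3, 1) @ n) / depth) @ np.linalg.inv(K_ref)
    return H / H[2, 2]

# one matrix per (non-reference view, depth plane), e.g.:
# H = [[plane_homography(K, K, R_i, t_i, d) for d in depth_planes]
#      for (R_i, t_i) in non_reference_poses]
```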
  • Step S47 Using a PS (plane sweep) algorithm and the first homography matrices, project each pixel of the top image of the first image pyramid, through the multiple depth planes, onto the planes where the top images of the m second image pyramids are located, and obtain the parameter value of each pixel point after projection.
  • the parameter value can be the color and texture of each pixel.
  • Step S48 Determine the matching loss of each pixel point at each depth value according to the parameter value of each pixel point of the top image of the first image pyramid and the parameter value of that pixel point after projection.
  • the matching loss can be defined as the absolute difference of the parameter values before and after the re-projection, and the parameter value can be a pixel color gradient.
  • Step S49 Take the matching losses of each pixel point of the top image of the first image pyramid over the plurality of depth planes as the first matching loss volume.
  • In the embodiment of the present application, the image is not rectified before the matching loss is calculated, as in the conventional method; instead, multiple depth planes are obtained and the matching loss is calculated through re-projection, so that the depth recovery can better adapt to the changes in camera pose between the viewing angles of the reference image and the m non-reference images, improving the reliability of the depth recovery method.
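  • A hedged sketch of steps S47-S49, building the first matching loss volume by re-sampling each non-reference top image onto the reference pixel grid with cv2.warpPerspective; the mean absolute colour difference over the m views is used here as the matching loss, one of the parameter choices the text allows:

```python
import cv2
import numpy as np

def plane_sweep_cost(ref, non_refs, homographies, depth_planes):
    """First matching loss volume of shape (D, H, W) for the top-level images.

    ref, non_refs       : colour images of shape (H, W, 3)
    homographies[i][k]  : 3x3 matrix mapping reference pixels to pixels of
                          non_refs[i] for depth_planes[k] (see the sketch above)
    """
    h, w = ref.shape[:2]
    cost = np.zeros((len(depth_planes), h, w), dtype=np.float32)
    for k in range(len(depth_planes)):
        diffs = []
        for i, non_ref in enumerate(non_refs):
            # H maps reference pixels to non-reference pixels, so use
            # WARP_INVERSE_MAP to resample the non-reference image onto
            # the reference pixel grid.
            warped = cv2.warpPerspective(
                non_ref, homographies[i][k], (w, h),
                flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
            diffs.append(np.abs(ref.astype(np.float32)
                                - warped.astype(np.float32)).mean(axis=2))
        cost[k] = np.mean(diffs, axis=0)       # average matching loss over the m views
    return cost
```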
  • FIG. 5 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • the embodiment shown in FIG. 5 is a refinement of step S45 in FIG. 4; for the content that is the same as in FIG. 4, reference may be made to the embodiment shown in FIG. 4.
  • the method shown in Figure 5 includes the following steps.
  • Step S51 Using the camera internal parameters, the camera external parameters, and a DLT (Direct Linear Transform) algorithm, calculate the second homography matrices that map the first depth plane, where the minimum depth value is located, from the reference image plane to the m non-reference image planes.
  • Step S52 Calculate, by using the camera internal parameters, the camera external parameters, and the direct linear transformation algorithm, the third homography matrices that map the second depth plane, where the maximum depth value is located, from the reference image plane to the m non-reference image planes.
  • the number of second homography matrices (and, likewise, of third homography matrices) depends on the number of non-reference images involved in the calculation, so a plurality of such matrices are obtained here.
  • Step S53 Projecting a pixel point in the reference image onto the plane where the m non-reference images are located according to the second homography matrix, to obtain a first projection point.
  • Step S54 Projecting one pixel point in the reference image onto the plane where the m non-reference images are located according to the third homography matrix, to obtain a second projection point.
  • Step S55 uniformly sampling a line formed between the first projection point and the second projection point to obtain a plurality of sampling points.
  • Step S56 Backprojecting a plurality of sampling points into a three-dimensional space of a viewing angle of the reference image to obtain a plurality of depth planes corresponding to depth values of the plurality of sampling points.
  • When the matching loss of a pixel of the reference image is calculated for a given depth plane, the pixel needs to be re-projected onto the m non-reference image planes; with the multiple depth planes obtained in this way, the re-projected positions in the m non-reference images are equally spaced, so the embodiment of the present application helps the subsequent steps to extract the pixel matching information between the reference image and the m non-reference images more effectively, thereby improving the precision of the scene depth map.
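  • A sketch of steps S51-S56 for a single reference pixel and one non-reference view; instead of the DLT homographies, the projections are computed directly from the calibrated poses (convention X_non = R·X_ref + t, an assumption of the sketch), and the back-projection is done with cv2.triangulatePoints:

```python
import cv2
import numpy as np

def depth_planes_for_pixel(u, v, K_ref, K_non, R, t, d_min, d_max, num=11):
    """Depth values whose re-projections are equally spaced in the
    non-reference image: project the pixel at d_min and d_max (first and
    second projection points), sample the segment between them uniformly,
    then back-project each sample into the reference view."""
    P_ref = K_ref @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P_non = K_non @ np.hstack([R, t.reshape(3, 1)])

    def project(depth):
        X = depth * (np.linalg.inv(K_ref) @ np.array([u, v, 1.0]))  # point on the pixel ray
        x = P_non @ np.append(X, 1.0)
        return x[:2] / x[2]

    p_min, p_max = project(d_min), project(d_max)
    alphas = np.linspace(0.0, 1.0, num)
    samples = (1 - alphas)[:, None] * p_min + alphas[:, None] * p_max

    # back-project: triangulate the fixed reference pixel against each sample
    ref_pts = np.tile(np.array([[u], [v]], dtype=np.float64), (1, num))
    X_h = cv2.triangulatePoints(P_ref, P_non, ref_pts, samples.T.astype(np.float64))
    return X_h[2] / X_h[3]                           # depths (Z) in the reference frame
```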
  • FIG. 6 is a flowchart of still another image background blurring method provided by an embodiment of the present application.
  • the embodiment shown in FIG. 6 is a refinement of step S22 in FIG. 2; for the content that is the same as in FIG. 2, reference may be made to the embodiment shown in FIG. 2.
  • the method shown in Figure 6 includes the following steps.
  • Step S61 Determine pixel points of the lower layer image of the first image pyramid corresponding to the pixel points of the top image of the first image pyramid.
  • Step S62 Determine pixel points of the lower layer images of the m second image pyramids corresponding to the pixel points of the top image of the m second image pyramids.
  • Step S63 Determine an estimated depth value of a pixel point of the lower layer image of the first image pyramid according to the preliminary depth map.
  • Step S64 Determine a minimum depth value and a maximum depth value of the pixel points of the lower layer image of the first image pyramid according to the estimated depth value.
  • Step S65 Determine a plurality of depth planes of the lower layer image of the first image pyramid between the minimum depth value and the maximum depth value.
  • Step S66 Calculate, by using the plane sweep algorithm and the plurality of depth planes, a second matching loss volume corresponding to the lower layer image of the first image pyramid and the lower layer images of the m second image pyramids.
  • Step S67 Using the lower layer image of the first image pyramid as the guide image, locally optimize the second matching loss volume by using a guided filtering algorithm to obtain a third matching loss volume.
  • Step S68 According to the third matching loss volume, select for each pixel of the lower layer image of the first image pyramid the depth value with the minimum matching loss, to obtain the scene depth map of the reference image.
  • In the embodiment of the present application, the preliminary depth map is used to estimate the minimum depth value and the maximum depth value of the pixel points of the lower layer image of the first image pyramid, thereby determining a relatively small depth search interval, which reduces the amount of calculation and improves the robustness of the depth recovery method against interference such as image noise.
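  • A condensed sketch of steps S67-S68: each depth slice of the (already narrowed) second matching loss volume is filtered with a guided filter, using the lower-layer reference image as the guide, and the per-pixel minimum is taken. cv2.ximgproc.guidedFilter requires the opencv-contrib-python package, and the radius/eps values are illustrative only:

```python
import cv2
import numpy as np

def refine_cost_volume(lower_ref, cost, radius=8, eps=1e-3):
    """Locally optimize a matching loss volume by guided filtering, then pick
    the depth index with the minimum loss per pixel.

    lower_ref : lower-layer reference image (float32), used as the guide
    cost      : (D, H, W) second matching loss volume
    """
    guide = lower_ref.astype(np.float32)
    filtered = np.empty_like(cost)
    for k in range(cost.shape[0]):
        # filter each depth slice so the loss follows the image structure
        filtered[k] = cv2.ximgproc.guidedFilter(guide, cost[k].astype(np.float32),
                                                radius, eps)
    return np.argmin(filtered, axis=0)           # best depth plane index per pixel
```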
  • FIG. 7 is a schematic diagram of an image background blurring apparatus provided by an embodiment of the present application.
  • FIG. 7 is an embodiment of the apparatus corresponding to FIG. 1.
  • the terminal device includes the following modules:
  • the extraction module 11 is configured to extract a reference image and m non-reference images in the target video according to an image extraction rule, where the target video is a video captured by the mobile terminal according to a predetermined trajectory, and m is greater than or equal to 1;
  • a building module 12 configured to construct a first image pyramid by using a reference image, and construct m second image pyramids by using m non-reference images;
  • a first determining module 13, configured to determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents the relative distance between any pixel point in the reference image and the mobile terminal;
  • the dividing module 14 is configured to divide the pixel points of the reference image into n depth layers by using the scene depth map, where the objects corresponding to the pixel points in different depth layers are at different depths from the mobile terminal, and n is greater than or equal to 2 (a layering sketch is given after this module list);
  • a second determining module 15, configured to determine a target location in the reference image;
  • the third determining module 16 is configured to determine, from the n depth layers, a target depth layer where the pixel point corresponding to the target location is located;
  • the fuzzy processing module 17 is configured to perform blur processing on the pixel to be processed, where the pixel to be processed is a pixel point included in a depth layer other than the target depth layer among the n depth layers.
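  • The layering sketch referenced above: one possible way for the dividing module 14 and the third determining module 16 to quantize the scene depth map into n layers and to read off the target layer at the user-specified position (equal-width depth bins are an assumption; the text only requires that different layers correspond to different depths):

```python
import numpy as np

def split_into_layers(depth_map, n):
    """Divide the reference-image pixels into n depth layers by quantizing
    the scene depth map.  Returns (H, W) layer indices in [0, n-1]."""
    edges = np.linspace(depth_map.min(), depth_map.max(), n + 1)
    return np.digitize(depth_map, edges[1:-1])

def target_layer(layers, target_xy):
    """Depth layer containing the pixel at the specified target position."""
    x, y = target_xy
    return layers[y, x]
```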
  • the first determining module 13 is configured to determine a preliminary depth map of the reference image according to the top image of the first image pyramid and the top images of the m second image pyramids, where the first image pyramid and the m second image pyramids each include a top layer image and a lower layer image; and to determine the scene depth map of the reference image according to the preliminary depth map, the lower layer image of the first image pyramid, and the lower layer images of the m second image pyramids.
  • the first determining module 13 is configured to calculate a first matching loss volume according to the top image of the first image pyramid and the top images of the m second image pyramids, and to construct a Markov Random Field model according to the first matching loss volume to perform global matching loss optimization and obtain a preliminary depth map of the reference image.
  • the first determining module 13 is configured to: acquire the camera external parameters and camera internal parameters of the mobile terminal at the viewing angles where the reference image and the m non-reference images are located; determine feature points in the reference image according to the feature point extraction rule; obtain the three-dimensional coordinates of the feature points of the reference image; determine the minimum depth value and the maximum depth value in the scene where the reference image is located according to the three-dimensional coordinates of the feature points of the reference image; determine a plurality of depth planes between the minimum depth value and the maximum depth value; calculate, by using the camera internal parameters, the camera external parameters, and a direct linear transformation algorithm, the first homography matrices that map the plurality of depth planes from the plane where the reference image is located to the planes where the m non-reference images are located; using the plane sweep algorithm and the first homography matrices, project each pixel of the top image of the first image pyramid, through the plurality of depth planes, onto the planes where the top images of the m second image pyramids are located, to obtain the parameter value of each pixel point after projection; determine the matching loss of each pixel point at each depth value according to the parameter value of each pixel point of the top image of the first image pyramid and the parameter value after projection; and take the matching losses of each pixel point of the top image of the first image pyramid over the plurality of depth planes as the first matching loss volume.
  • the first determining module 13 is configured to: calculate, by using the camera internal parameters, the camera external parameters, and the direct linear transformation algorithm, the second homography matrices that map the first depth plane, where the minimum depth value is located, from the reference image plane to the m non-reference image planes; calculate, by using the camera internal parameters, the camera external parameters, and the direct linear transformation algorithm, the third homography matrices that map the second depth plane, where the maximum depth value is located, from the reference image plane to the m non-reference image planes; project a pixel point in the reference image onto the planes where the m non-reference images are located according to the second homography matrices to obtain a first projection point; project the same pixel point in the reference image onto the planes where the m non-reference images are located according to the third homography matrices to obtain a second projection point; uniformly sample the line formed between the first projection point and the second projection point to obtain a plurality of sampling points; and back-project the plurality of sampling points into the three-dimensional space of the viewing angle of the reference image to obtain a plurality of depth planes corresponding to the depth values of the plurality of sampling points.
  • the first determining module 13 is specifically configured to: determine the pixel points of the lower layer image of the first image pyramid corresponding to the pixel points of the top image of the first image pyramid; determine the pixel points of the lower layer images of the m second image pyramids corresponding to the pixel points of the top images of the m second image pyramids; determine an estimated depth value of the pixel points of the lower layer image of the first image pyramid according to the preliminary depth map; determine the minimum depth value and the maximum depth value of the pixel points of the lower layer image of the first image pyramid according to the estimated depth value; determine a plurality of depth planes of the lower layer image of the first image pyramid between the minimum depth value and the maximum depth value; calculate, by using the plane sweep algorithm and the plurality of depth planes, a second matching loss volume corresponding to the lower layer image of the first image pyramid and the lower layer images of the m second image pyramids; use the lower layer image of the first image pyramid as the guide image and locally optimize the second matching loss volume by a guided filtering algorithm to obtain a third matching loss volume; and according to the third matching loss volume, select for each pixel of the lower layer image of the first image pyramid the depth value with the minimum matching loss, to obtain the scene depth map of the reference image.
  • the third determining module 16 is specifically configured to: acquire a specified pixel point at the target location of the reference image; determine the pixel value corresponding to the specified pixel point in the scene depth map; and determine, from the n depth layers, the target depth layer where the specified pixel point is located according to the obtained pixel value.
  • the blur processing module 17 is specifically configured to: determine the L depth layers where the pixels to be processed are located, where L is greater than or equal to 2 and less than n; calculate the depth difference between each of the L depth layers and the target depth layer; and perform blur processing of a preset proportion on the pixel points of each of the L depth layers, where the degree of blur of the pixel points of each of the L depth layers is proportional to the depth difference.
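  • A minimal sketch of the blur processing just described, assuming a Gaussian blur whose kernel size grows linearly with the depth-layer difference from the target layer (both the blur type and the scaling factor are assumptions):

```python
import cv2
import numpy as np

def blur_background(ref_img, layers, target, kernel_per_step=2):
    """Blur every depth layer except the target layer; the degree of blur of
    each layer is proportional to its layer difference from the target."""
    out = ref_img.copy()
    for layer in np.unique(layers):
        if layer == target:
            continue                                   # the target layer stays sharp
        diff = abs(int(layer) - int(target))
        ksize = 2 * kernel_per_step * diff + 1         # odd kernel, grows with the gap
        blurred = cv2.GaussianBlur(ref_img, (ksize, ksize), 0)
        mask = layers == layer
        out[mask] = blurred[mask]
    return out
```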
  • FIG. 8 is a schematic diagram of still another image background blurring device provided by an embodiment of the present application.
  • the apparatus includes a processor 21 and a memory 22, where the memory 22 stores operation instructions executable by the processor 21, and the processor 21 reads the operation instructions in the memory 22 to implement the methods in the foregoing method embodiments.
  • FIG. 9 is a schematic diagram showing a design structure of an image background blurring device provided by an embodiment of the present application.
  • the image background blurring device includes a transmitter 1101, a receiver 1102, a controller/processor 1103, a memory 1104, and a modem processor 1105.
  • The transmitter 1101 conditions (e.g., converts to analog, filters, amplifies, and up-converts) the output samples and generates an uplink signal, which is transmitted to the base station via the antenna.
  • the antenna receives the downlink signal transmitted by the base station.
  • Receiver 1102 conditions (eg, filters, amplifies, downconverts, digitizes, etc.) the signals received from the antenna and provides input samples.
  • encoder 1106 receives the traffic data and signaling messages to be transmitted on the uplink and processes (e.g., formats, codes, and interleaves) the traffic data and signaling messages.
  • Modulator 1107 further processes (e.g., symbol maps and modulates) the encoded traffic data and signaling messages and provides output samples.
  • Demodulator 1109 processes (e.g., demodulates) the input samples and provides symbol estimates.
  • the decoder 1108 processes (e.g., deinterleaves and decodes) the symbol estimate and provides decoded data and signaling messages that are sent to the terminal.
  • The encoder 1106, modulator 1107, demodulator 1109, and decoder 1108 may be implemented by a composite modem processor 1105. These units perform processing according to the radio access technology employed by the radio access network (e.g., the access technologies of LTE and other evolved systems).
  • the controller/processor 1103 is configured to: extract a reference image and m non-reference images in the target video according to an image extraction rule, where the target video is a video captured by the mobile terminal according to a predetermined trajectory, and m is greater than or equal to 1; construct a first image pyramid by using the reference image, and construct m second image pyramids by using the m non-reference images; determine a scene depth map of the reference image by using the first image pyramid and the m second image pyramids, where the scene depth map of the reference image represents the relative distance between any pixel point in the reference image and the mobile terminal; divide the pixel points of the reference image into n depth layers by using the scene depth map, where the objects corresponding to the pixel points in different depth layers are at different depths from the mobile terminal, and n is greater than or equal to 2; determine a target position in the reference image; determine, from the n depth layers, the target depth layer where the pixel point corresponding to the target position is located; and perform blur processing on the pixels to be processed, where a pixel to be processed is a pixel point included in a depth layer other than the target depth layer among the n depth layers.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of units is only a logical function division; in actual implementation, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, and may be in an electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

According to some embodiments, the present invention relates to an image background blurring method and apparatus. The method comprises: extracting a reference image and m non-reference images from a target video according to an image extraction rule; constructing a first image pyramid using the reference image, and constructing m second image pyramids using the m non-reference images; determining a scene depth map of the reference image using the first image pyramid and the m second image pyramids; dividing pixel points of the reference image into n depth layers by means of the scene depth map; determining target positions in the reference image; determining, from the n depth layers, a target depth layer at which the pixel points corresponding to the target positions are located; and blurring the pixels to be processed. In the embodiments of the present invention, the pixels to be processed, which are included in a depth layer other than the target depth layer among the n depth layers, can be blurred, so as to obtain an image in which the pixels of the target depth layer are clear and the pixels to be processed are blurred.
PCT/CN2017/117180 2017-03-27 2017-12-19 Procédé et appareil de flou d'arrière-plan d'image Ceased WO2018176929A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710189167.0 2017-03-27
CN201710189167.0A CN108668069B (zh) 2017-03-27 2017-03-27 一种图像背景虚化方法及装置

Publications (1)

Publication Number Publication Date
WO2018176929A1 true WO2018176929A1 (fr) 2018-10-04

Family

ID=63674131

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/117180 Ceased WO2018176929A1 (fr) 2017-03-27 2017-12-19 Procédé et appareil de flou d'arrière-plan d'image

Country Status (2)

Country Link
CN (1) CN108668069B (fr)
WO (1) WO2018176929A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110120009A (zh) * 2019-05-09 2019-08-13 西北工业大学 基于显著物体检测和深度估计算法的背景虚化实现方法
CN110910304A (zh) * 2019-11-08 2020-03-24 北京达佳互联信息技术有限公司 图像处理方法、装置、电子设备及介质
CN111222514A (zh) * 2019-12-31 2020-06-02 西安航天华迅科技有限公司 一种基于视觉定位的局部地图优化方法
CN114757860A (zh) * 2022-03-23 2022-07-15 江西师范大学 基于自监督多尺度金字塔融合网络的图像散景虚化方法
CN116703774A (zh) * 2023-06-20 2023-09-05 中国科学院软件研究所 基于三维点云匹配与真实场景重建的视频去雾方法、装置

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110992412B (zh) * 2019-12-09 2023-02-28 Oppo广东移动通信有限公司 图像处理方法、装置、存储介质及电子设备
CN112948814A (zh) * 2021-03-19 2021-06-11 合肥京东方光电科技有限公司 一种账号密码的管理方法、装置及存储介质
CN115760986B (zh) * 2022-11-30 2023-07-25 北京中环高科环境治理有限公司 基于神经网络模型的图像处理方法及装置

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102156997A (zh) * 2010-01-19 2011-08-17 索尼公司 图像处理设备和图像处理方法
US8284258B1 (en) * 2008-09-18 2012-10-09 Grandeye, Ltd. Unusual event detection in wide-angle video (based on moving object trajectories)
CN102801910A (zh) * 2011-05-27 2012-11-28 三洋电机株式会社 摄像装置
CN103037075A (zh) * 2011-10-07 2013-04-10 Lg电子株式会社 移动终端及其离焦图像生成方法
CN104424640A (zh) * 2013-09-06 2015-03-18 格科微电子(上海)有限公司 对图像进行虚化处理的方法和装置
CN105578026A (zh) * 2015-07-10 2016-05-11 宇龙计算机通信科技(深圳)有限公司 一种拍摄方法及用户终端
CN106060423A (zh) * 2016-06-02 2016-10-26 广东欧珀移动通信有限公司 虚化照片生成方法、装置和移动终端
CN106331492A (zh) * 2016-08-29 2017-01-11 广东欧珀移动通信有限公司 一种图像处理方法及终端
CN106530241A (zh) * 2016-10-31 2017-03-22 努比亚技术有限公司 一种图像虚化处理方法和装置

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110120009A (zh) * 2019-05-09 2019-08-13 西北工业大学 基于显著物体检测和深度估计算法的背景虚化实现方法
CN110120009B (zh) * 2019-05-09 2022-06-07 西北工业大学 基于显著物体检测和深度估计算法的背景虚化实现方法
CN110910304A (zh) * 2019-11-08 2020-03-24 北京达佳互联信息技术有限公司 图像处理方法、装置、电子设备及介质
CN110910304B (zh) * 2019-11-08 2023-12-22 北京达佳互联信息技术有限公司 图像处理方法、装置、电子设备及介质
CN111222514A (zh) * 2019-12-31 2020-06-02 西安航天华迅科技有限公司 一种基于视觉定位的局部地图优化方法
CN111222514B (zh) * 2019-12-31 2023-06-27 上海星思半导体有限责任公司 一种基于视觉定位的局部地图优化方法
CN114757860A (zh) * 2022-03-23 2022-07-15 江西师范大学 基于自监督多尺度金字塔融合网络的图像散景虚化方法
CN116703774A (zh) * 2023-06-20 2023-09-05 中国科学院软件研究所 基于三维点云匹配与真实场景重建的视频去雾方法、装置

Also Published As

Publication number Publication date
CN108668069A (zh) 2018-10-16
CN108668069B (zh) 2020-04-14

Similar Documents

Publication Publication Date Title
WO2018176929A1 (fr) Procédé et appareil de flou d'arrière-plan d'image
US12182938B2 (en) System and method for virtual modeling of indoor scenes from imagery
US12272020B2 (en) Method and system for image generation
US11816810B2 (en) 3-D reconstruction using augmented reality frameworks
US11394898B2 (en) Augmented reality self-portraits
CN105283905B (zh) 使用点和线特征的稳健跟踪
US9635251B2 (en) Visual tracking using panoramas on mobile devices
CN108492316A (zh) 一种终端的定位方法和装置
CN108986161A (zh) 一种三维空间坐标估计方法、装置、终端和存储介质
CN106875431B (zh) 具有移动预测的图像追踪方法及扩增实境实现方法
CN109887003A (zh) 一种用于进行三维跟踪初始化的方法与设备
US20050265453A1 (en) Image processing apparatus and method, recording medium, and program
US10545215B2 (en) 4D camera tracking and optical stabilization
CN111192308B (zh) 图像处理方法及装置、电子设备和计算机存储介质
CN112927251B (zh) 基于形态学的场景稠密深度图获取方法、系统及装置
CN108961182B (zh) 针对视频图像的竖直方向灭点检测方法及视频扭正方法
US12282992B2 (en) Machine learning based controllable animation of still images
CN103617631B (zh) 一种基于中心检测的跟踪方法
CN115941924A (zh) 纠偏方法、装置、电子设备以及存储介质
CN117057086A (zh) 基于目标识别与模型匹配的三维重建方法、装置及设备
US11315346B2 (en) Method for producing augmented reality image
US20250173883A1 (en) Real-time, high-quailty, and spatiotemporally consistent depth estimation from two-dimensional, color images
CN115019298B (zh) 一种基于rgb单图的三维物体框的检测方法、装置以及设备
US20230394749A1 (en) Lighting model
EP3594900A1 (fr) Suivi d'un objet dans une séquence d'images panoramiques

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17902751

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17902751

Country of ref document: EP

Kind code of ref document: A1