WO2013079602A1 - Spatio-temporal disparity-map smoothing by joint multilateral filtering - Google Patents
- Publication number
- WO2013079602A1 (PCT/EP2012/073979)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- filter
- disparity map
- filtering
- section
- contemplated
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/70—Denoising; Smoothing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/10—Image enhancement or restoration using non-spatial domain filtering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/20—Image enhancement or restoration using local operators
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/128—Adjusting depth or disparity
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20024—Filtering details
- G06T2207/20032—Median filtering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20172—Image enhancement details
- G06T2207/20182—Noise reduction or smoothing in the temporal domain; Spatio-temporal filtering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N2013/0074—Stereoscopic image analysis
- H04N2013/0081—Depth or disparity estimation from stereoscopic image signals
Definitions
- One common approach is to use joint-bilateral filters.
- the idea is to calculate the filter coefficients of a bilateral filter [4] scene-adaptively by using the color information of the original images and to apply the adaptive filter to the disparity maps.
- DIBR: Depth-Image Based Rendering
- Embodiments of the present invention provide a filter structure for filtering a disparity map.
- the filter structure comprises a first filter and a second filter.
- the first filter is provided for filtering a contemplated section of the disparity map according to a first measure of central tendency.
- the second filter is for filtering the contemplated section of the disparity maps according to a second measure of central tendency.
- the filter structure further comprises a filter selector for selecting the first filter or the second filter for filtering the contemplated section of the disparity map. The selection done by the filter selector is based on at least one local property of the contemplated section.
- Further embodiments provide a method for filtering a disparity map.
- the method comprises determining a local property of a contemplated section of the disparity map for the purpose of filtering.
- a first filter or a second filter is then selected for filtering the contemplated section, the selection being based on the at least one determined local property of the contemplated section.
- the method further comprises filtering the contemplated section of the disparity map using the first filter or the second filter depending on a result of the selection of the first filter or the second filter.
- the present invention reduces visually annoying disturbances and artifacts that are created by applying Depth-Image Based Rendering (DIBR) to the original stereo images, by introducing a new joint multi-lateral filter to improve previously estimated disparity maps before using them for DIBR.
- the improvement refers to all three properties that are important for pleasant rendering results: spatial smoothness, temporal consistency and exact alignment of depth discontinuities to object borders. It can be applied to any kind of disparity map independently of the method by which they have been created.
- Fig. 1 shows an example of some motion compensated images (MCIs) in a symmetric cluster around reference frame t_0;
- Fig. 2 shows a schematic block diagram of a filter structure according to embodiments
- Fig. 3 shows a schematic flow diagram of a method for filtering a disparity map according to embodiments
- Fig. 4 schematically illustrates an effect of scene-adaptive switching between weighted median and average filters
- Fig. 5 shows a performance comparison between distance/color kernel of conventional cross-bilateral filters and the invented filter method using the new distance function
- Fig. 6 illustrates an improvement achieved by the introduction of the new confidence kernel.
- the invention uses the following new multi-lateral filter structure instead of conventional bilateral filters.
- the multi-lateral filter structure is based on spatio-temporal processing using motion compensation.
- it can also be applied to an asymmetric cluster in the temporal interval [t_0 - r_t ; t_0] using current and previous frames only.
- a combination of KLT-Tracking (Kanade-Lucas-Tomasi (feature) Tracking) and simple frame differencing is used to create motion compensated images (MCI).
- MCI motion compensated images
- any other method of motion compensation can be taken as well.
- corresponding pixels are at the same position p in all MCIs as in the current reference frame t_0.
- Figure 1 shows an example of some MCIs in a symmetric cluster around reference frame t_0.
- Non-RMC pixels are marked with white color in Figure 1.
- RMC reliably motion compensated
- weighted median filters preserve depth discontinuities better than weighted average filters at object borders.
- weighted average filters clearly outperform weighted median filters.
- the underlying invention uses a multi-lateral filter structure with the weighting factors
  $$ w(p, t) = \mathrm{weight}_{dist}(p, p_0, t) \cdot \mathrm{weight}_{conf}(p, t) \cdot \mathrm{weight}_{temp}(p, t) $$
- the filter structure is based on a 3-dimensional spatio-temporal window n around a center pixel position p_0 and the reference frame t_0.
- p and t denote pixels and frames within window n.
- the filter structure is applied to motion-compensated disparity values D_mc(p, t) (see section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel").
- the weighted averaging filter is an operation where the motion-compensated disparity values D_mc(p, t) are multiplied by adaptive weighting factors w(p, t) before calculating the average score.
- in the weighted median filter, the frequencies of the motion-compensated disparity values D_mc(p, t) are multiplied by the weighting factors w(p, t) before calculating the median score.
- the weighting factors have to be normalized such that their sum over the active window space equals a pre-defined constant value.
- the central pixel p_0 is labeled as an RMC pixel (see section "Motion Compensation and Reliably Motion Compensated Pixel")
- or a depth discontinuity is detected inside the filter window at reference frame t_0 (e.g., the maximal gradient of the initial disparity values D_mc(p, t_0) exceeds a predefined threshold Thres_DepthDisc)
- or the color values of the original images differ significantly within the filter window at reference frame t_0 (e.g., the variance of the color samples within the window exceeds a predefined threshold Thres_Var)
- FIG. 2 shows a schematic block diagram of a filter structure 20 according to embodiments of the teachings disclosed herein.
- the filter structure 20 for filtering a disparity map D(p, t_0) of a temporal sequence of disparity maps D(p, t) comprises a first filter 24 for filtering a contemplated section 12 (for example: pixel p_0, or a group of pixels) of the disparity map D(p, t_0) according to a first measure of central tendency; a second filter 26 for filtering the contemplated section 12 (e.g., p_0) of the disparity maps D(p, t_0) according to a second measure of central tendency; and a filter selector 22 for selecting the first filter 24 or the second filter 26 for filtering the contemplated section 12 of the disparity map D(p, t_0), the selection being based on at least one local property of the contemplated section 12.
- the filter structure 20 is configured to output a filtered disparity map D_o(p, t_0) (i.e., the filtered disparity values) comprising a filtered contemplated section 92.
- the first filter 24 may be a median filter and filter the contemplated section 12 according to a median filtering scheme.
- the second filter 26 may be an average filter and configured to filter the contemplated section 12 according to an average filtering scheme.
- the at least one local property may control a binary mask Mask(p, t_0) for the disparity map indicating to the filter selector 22 whether the first filter 24 or the second filter 26 is to be used for filtering the contemplated section.
- the at least one local property may comprise at least one of
- the contemplated section 12 being labeled as reliably motion compensated (RMC) by a motion compensation unit upstream of the filter structure;
- the first filter 24 may be a weighted first filter and the second filter 26 may be a weighted second filter.
- a weighting performed by the weighted first filter 24 or the weighted second filter 26 may be based on at least one of
- the distance measure may be determined on the basis of a sum of color differences along a path from the contemplated section 12 (e.g. p_0) to the further section.
- a filter window (n) may be associated to the contemplated section 12 (e.g. p_0) of the disparity map D(p, t_0), the filter window (n) being a 3-dimensional spatio-temporal window and defining a spatial extension and a temporal extension of filtering actions performed by the first filter 24 and the second filter 26.
- the filter structure 20 may further comprise a section iterator for iterating the contemplated section 12 (e.g. p_0) of the disparity map D(p, t_0) over the disparity map or a part thereof.
- the contemplated section 12 (e.g. p_0) may correspond to a pixel of the disparity map D(p, t_0).
- the filter selector 22 may comprise an adaptive switching unit for switching between the first filter 24 and the second filter 26.
- Figure 3 shows a schematic flow diagram of a method for filtering a disparity map D(p, t_0) of a temporal sequence of disparity maps.
- the method comprises a step 302 of determining a local property of a contemplated section p_0 of the disparity map for the purpose of filtering.
- a first filter or a second filter is then selected at a step 304.
- the first and second filters are provided for filtering the contemplated section (e.g. pixel p_0 or a region surrounding pixel p_0). The selection is based on the at least one determined local property of the contemplated section.
- the method further comprises a step 306 of filtering the contemplated section p_0 of the disparity map D(p, t_0) using the first filter or the second filter depending on a result of the selection of the first filter or the second filter.
- the contemplated section is filtered using a first measure of central tendency or a second measure of central tendency.
- the first measure of central tendency may be a median and the second measure of central tendency may be an average.
- the method may further comprise: determining a binary mask for the disparity map on the basis of the local property, the binary mask indicating to the filter selector whether the first filter or the second filter is to be used for filtering the contemplated section (e.g. p_0).
- Filtering the contemplated section may comprise: weighting disparity values comprised in the contemplated section of the disparity map, for example using weighting factors w(p, t).
- the distance measure may be determined on the basis of a sum of color differences along a path from the contemplated section (e.g. p_0) to a further section.
- the method may further comprise: iterating the contemplated section (e.g. p_0) of the disparity map D(p, t_0) over the disparity map or a part thereof.
- the selection of the first filter or the second filter may comprise a scene-adaptive switching between the first filter and the second filter in dependence on a local structure of color images and depth maps corresponding to the disparity map D(p, t_0).
- Figure 4 shows the effect of the scene-adaptive switching between weighted median and average filters.
- the top depicts an original image (left) and a magnified region with a large region of homogenous color (right).
- the large region of homogeneous color (yellow in the original color image) has been hatched in Figure 4.
- the left picture in the middle shows the initial disparity map in this region (boundaries of regions having different disparity values being enhanced for clearer representation). Note that it contains some matching noise although the object in the region under inspection refers to a plane in the 3D space.
- the right picture in the middle shows the results after applying the weighted median filter only to the initial disparity values.
- the depth discontinuity at the object border could be preserved due to the median properties, but the smoothing in the homogeneous color regions is still imperfect.
- the related picture at bottom shows the result for an adaptive switching between the two filter types. Note that the depth discontinuities are still preserved due to the usage of the weighted median in this area, whereas the smoothing in the homogeneous region is better now because the filter structure switches to weighted average in this region.
- the left picture at the bottom shows the binary mask Mask(p, t_0) that has been derived from the original color image (top right) and the initial disparity map (middle right) to control the adaptive switching: Black indicates "weighted average filter" and white indicates "weighted median filter".
- the invention may also use a new kind of distance kernel.
- the usual one of conventional bilateral filters is replaced by a kernel weight_dist(p, p_0, t) that represents the costs of the cheapest path between all pixels in the filter window and its center pixel p_0 at all frames t in the 3-dimensional window.
- a path P_i(p, p_0, t) is a sequence of adjacent pixels that can be found for each frame t between an arbitrary pixel p in the filter window and its center pixel p_0 by using an 8-connectivity operator.
- the index i indicates that there is usually more than one possible path between these two points.
- the cost C(P_i) of a particular path P_i is the sum of all absolute color differences along the path.
- $$ \mathrm{Dist}(p, p_0, t) = \min_{P_i \in \{P_i(p, p_0, t)\}} \{ C(P_i(p, p_0, t)) \} $$ with {P_i(p, p_0, t)} indicating the set of all possible paths between p and p_0 at frame t.
- Dist(p, p_0, t) describes the "cheapest" path from p_0 to p at frame t.
- Figure 5 shows a performance comparison between distance/color kernel of conventional cross-bilateral filters and the invented filter method using the new distance function.
- Figure 5 gives an example of the improvements that can be achieved with the new distance function.
- the left picture at top shows a black-and-white version of the original color image and the right picture is a magnification of a critical region as well as the corresponding disparity map.
- the magnified image region shows a part of the standing woman's head in front of the background. Note that the background contains an f-letter that has the same color as the woman's hair.
- the left pictures at the bottom show the weights of a conventional bilateral filter (for a window with the center pixel p_0 in the f-letter at the background, see white dot at the intersection of the vertical and horizontal bars of the f-letter, as indicated by the arrow in the magnified image region at top) as well as the resulting filtered disparity map.
- the filter improves the alignment of the disparity map to the object border, but it also aligns a wrong disparity (depth) to the f-letter in the background. The reason is that the woman's hair and the f-letter have almost the same color and that the distance between head and f-letter is not large enough to clearly separate these two objects.
- the pixels in the head are labeled with high weights in this case and the disparity value of the head's depth is wrongly aligned to the region of the f-letter.
- this misalignment can be avoided by using the new distance function.
- the costs of any path from the center pixel p_0 to all pixels p in the head region are relatively high because of the blue background color between the woman's hair and the f-letter.
- the distance between head and f-letter is artificially increased and the high weights in the head region are removed.
- the correct disparity value referring to the background depth is now aligned to the f-letter.
- the underlying invention also introduces a new confidence kernel weight_conf(p, t). It takes into account that confidence and reliability measures might be available from the matching process for each disparity value. Hence, they can be used to assign a high weight to reliable matches and, vice versa, low weights to matches with low confidence and poor reliability. In principle, any confidence and reliability measure can be taken in this context.
- $$ \mathrm{weight}_{conf}(p, t) = \mathrm{conf}_D(D_{mc}(p, t)) \cdot \mathrm{conf}_I(I_{mc}(p, t)) $$
- a very common confidence measure evaluating the reliability of estimated disparity maps is based on the left-right consistency. Assuming that both left-to-right and right-to-left disparity maps are available, a consistency check can be carried out by calculating the following difference diff_D(p, t):
- D_mc,lr(p, t) and D_mc,rl(p, t) denote initial disparity maps which have been estimated from left to right stereo images and, vice versa, from right to left images at frame t, and have been motion compensated afterwards as described in Section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel".
- p and D are 2-dimensional vectors in eq. (5), each containing a horizontal and a vertical image component; assuming a rectified stereo state, the vertical component of the disparity maps is usually equal to zero.
- the disparity related term of the confidence kernel can then be calculated by the reciprocal of the difference from the left-right consistency:
- I_mc,l(p, t) and I_mc,r(p, t) denote the motion-compensated color images as described in Section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel".
- Figure 6 shows from left to right: a black-and-white version of the original color image; (a) a magnified disparity map in a critical region; (b) a filtered disparity map obtained by using conventional cross-bilateral filter; (c) a confidence kernel in this region; and (d) the improvement achieved by using additional confidence kernel.
- the left picture in Figure 6 again shows a black-and-white version of the initial color image followed by a magnified disparity map estimated in a critical region (picture (a) ).
- the next picture (b) shows the disparity map after conventional cross-bilateral filtering.
- the disparity map has clearly been improved but some mismatches remain (see black circle).
- the third picture (c) shows the confidence map in this region. Non-reliable matches are marked with black color. As these areas also cover the remaining mismatches after conventional filtering, they are removed with the new method using an additional confidence kernel. The improved result is shown in the right picture (d).
- a kernel weight_temp(p, t) enforcing temporal consistency may also be introduced in the new filter structure from Section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel".
- This temporal kernel controls the influence of (temporally) adjacent frames to the final filter results and, with it, smoothes the results in temporal direction.
- temporal filtering is applied to the motion compensated disparity maps and is restricted to RMC pixels only (see Section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel"): weight_temp(p, t) = exp(-(…)^2) if p is an RMC pixel
- aspects described in the context of an apparatus it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
- Some or all of the method steps may be executed by (or using) a hardware apparatus, for example a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
- embodiments of the invention can be implemented in hardware or in software.
- the implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a pro- grammable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable. Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
- embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer.
- the program code may for example be stored on a machine readable carrier.
- Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
- an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
- a further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.
- the data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory.
- a further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein.
- the data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.
- a further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
- a further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
- a further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver.
- the receiver may, for example, be a computer, a mobile device, a memory device or the like.
- the apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
- a programmable logic device for example a field programmable gate array
- a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein.
- the methods are preferably performed by any hardware apparatus.
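The cheapest-path distance Dist(p, p_0, t) described in the entries above can be illustrated for a single frame with a short sketch. It is only an illustration under stated assumptions: the image is a grayscale list of lists (the text speaks of color differences), Dijkstra's algorithm is chosen here as one possible way to find the cheapest 8-connected path, and the function name `cheapest_path_costs` is hypothetical.

```python
import heapq

def cheapest_path_costs(image, p0):
    """Return Dist(p, p0) for every pixel p of one frame.

    The cost of a path is the sum of absolute intensity differences between
    consecutive pixels; paths use 8-connectivity. Grayscale input and the
    use of Dijkstra's algorithm are assumptions of this sketch.
    """
    h, w = len(image), len(image[0])
    y0, x0 = p0
    dist = [[float("inf")] * w for _ in range(h)]
    dist[y0][x0] = 0.0
    heap = [(0.0, y0, x0)]
    while heap:
        d, y, x = heapq.heappop(heap)
        if d > dist[y][x]:
            continue  # stale queue entry, a cheaper path was already found
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy or dx) and 0 <= ny < h and 0 <= nx < w:
                    nd = d + abs(image[ny][nx] - image[y][x])
                    if nd < dist[ny][nx]:
                        dist[ny][nx] = nd
                        heapq.heappush(heap, (nd, ny, nx))
    return dist
```

With such a distance, a pixel that has the same color as the center pixel but is separated from it by a differently colored region receives a high cost, which matches the behavior described for the f-letter example.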
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Image Processing (AREA)
Abstract
A filter structure for filtering a disparity map D(p, t_0) comprises a first filter (24), a second filter (26), and a filter selector (22). The first filter (24) is for filtering a contemplated section (12) of the disparity map D(p, t_0) according to a first measure of central tendency. The second filter (26) is for filtering the contemplated section (12) of the disparity maps according to a second measure of central tendency. The filter selector (22) is provided for selecting the first filter (24) or the second filter (26) for filtering the contemplated section (12) of the disparity map D(p, t_0), the selection being based on at least one local property of the contemplated section (12). A corresponding method for filtering a disparity map comprises determining a local property of the contemplated section (12) and selecting a filter. The contemplated section (12) is then filtered using the first filter (24) or the second filter (26) depending on a result of the selection.
Description
Spatio-temporal disparity-map smoothing by joint multilateral filtering
Estimating dense pixel-by-pixel disparity maps from a pair of stereo images has been an active research topic for decades. A good review of current research in this field can be found in [1], [2] and [3] (see list of references below). One way to distinguish between the different approaches is to divide them into two categories, global and local methods. Local methods usually compare small patches in the left and right image to find the best match. Global approaches aim to find a globally optimal solution for the whole frame.
Almost all disparity estimation algorithms use some kind of post-processing to
• align disparity (depth) discontinuities to object borders
• remove matching noise and mismatches
• fill image areas with unmatched pixels
• enforce temporal consistency
One common approach is to use joint-bilateral filters. The idea is to calculate the filter coefficients of a bilateral filter [4] scene-adaptively by using the color information of the original images and to apply the adaptive filter to the disparity maps.
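As a rough illustration of this idea, the following sketch computes the filter weights from a guide image and applies them to the disparity values. It is a generic joint-bilateral filter, not the filter proposed in this document; the grayscale guide image, the Gaussian kernels, and the parameter names `radius`, `sigma_s` and `sigma_c` are assumptions made for the example.

```python
import math

def joint_bilateral(disparity, guide, radius=2, sigma_s=2.0, sigma_c=10.0):
    """Smooth `disparity` with weights taken from the grayscale `guide` image."""
    h, w = len(disparity), len(disparity[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    # Spatial closeness and color similarity, both measured
                    # in the guide image, not in the disparity map.
                    d2 = (yy - y) ** 2 + (xx - x) ** 2
                    c2 = (guide[yy][xx] - guide[y][x]) ** 2
                    wgt = math.exp(-d2 / (2 * sigma_s ** 2)
                                   - c2 / (2 * sigma_c ** 2))
                    num += wgt * disparity[yy][xx]
                    den += wgt
            out[y][x] = num / den  # den >= 1 because the center weight is 1
    return out
```

Because the weights depend only on the guide image, disparity values on the far side of a strong color edge contribute almost nothing to the result.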
While local methods usually compare small patches in the left and right image to find the best match, global approaches aim to find a globally optimal solution for the whole frame. Local correlation algorithms often produce noisy disparity maps with inaccurately aligned object borders, but offer the possibility to provide temporal consistency of the disparity maps. In contrast, global methods enable spatial smoothness and well aligned depth discontinuities at object borders, but they usually do not consider temporal consistency. As a result, generating virtual intermediate views from these disparity maps by applying Depth-Image Based Rendering (DIBR) to the original stereo images often creates visually annoying disturbances and artifacts.
Summary
Embodiments of the present invention provide a filter structure for filtering a disparity map. The filter structure comprises a first filter and a second filter. The first filter is provided for filtering a contemplated section of the disparity map according to a first measure of central tendency. The second filter is for filtering the contemplated section of the disparity maps according to a second measure of central tendency. The filter structure further comprises a filter selector for selecting the first filter or the second filter for filtering the contemplated section of the disparity map. The selection done by the filter selector is based on at least one local property of the contemplated section.
Further embodiments provide a method for filtering a disparity map. The method comprises determining a local property of a contemplated section of the disparity map for the purpose of filtering. A first filter or a second filter is then selected for filtering the contemplated section, the selection being based on the at least one determined local property of the contemplated section. The method further comprises filtering the contemplated section of the disparity map using the first filter or the second filter depending on a result of the selection of the first filter or the second filter.
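The three method steps (determine a local property, select a filter, filter the section) can be sketched as follows. This is a minimal sketch, assuming a weighted median and a weighted average as the two measures of central tendency, and assuming that the local property is a crude depth-range check plus a color-variance check. The function names, the default threshold values, and the use of the value range as a stand-in for the maximal gradient are all illustrative assumptions.

```python
def weighted_average(values, weights):
    """Weighted mean of the disparity values in the window."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight."""
    half = 0.5 * sum(weights)
    acc = 0.0
    for v, w in sorted(zip(values, weights)):
        acc += w
        if acc >= half:
            return v
    raise ValueError("empty window")

def select_filter(disparities, colors, thres_depth_disc=4.0, thres_var=100.0):
    """Pick the median near structure and the average in flat regions.

    Threshold values are made up for the example.
    """
    depth_range = max(disparities) - min(disparities)  # stand-in for max gradient
    mean_c = sum(colors) / len(colors)
    variance = sum((c - mean_c) ** 2 for c in colors) / len(colors)
    if depth_range > thres_depth_disc or variance > thres_var:
        return weighted_median
    return weighted_average

def filter_section(disparities, colors, weights):
    """Determine the local property, select a filter, and apply it."""
    return select_filter(disparities, colors)(disparities, weights)
```

For a window that straddles a depth step, the median returns one of the existing disparity values instead of a blend of foreground and background, whereas the average gives the smoother result in a homogeneous window.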
Further embodiments provide a computer readable digital storage medium having stored thereon a computer program having a program code for performing, when running on a computer, the method for filtering a disparity map mentioned above.
The present invention reduces visually annoying disturbances and artifacts that are created by applying Depth-Image Based Rendering (DIBR) to the original stereo images, by introducing a new joint multi-lateral filter to improve previously estimated disparity maps before using them for DIBR. The improvement refers to all three properties that are important for pleasant rendering results: spatial smoothness, temporal consistency and exact alignment of depth discontinuities to object borders. It can be applied to any kind of disparity map independently of the method by which they have been created.
Brief description of the Drawings
Fig. 1 shows an example of some motion compensated images (MCIs) in a symmetric cluster around reference frame t_0;
Fig. 2 shows a schematic block diagram of a filter structure according to embodiments;
Fig. 3 shows a schematic flow diagram of a method for filtering a disparity map according to embodiments;
Fig. 4 schematically illustrates an effect of scene-adaptive switching between weighted median and average filters;
Fig. 5 shows a performance comparison between distance/color kernel of conventional cross-bilateral filters and the invented filter method using the new distance function;
Fig. 6 illustrates an improvement achieved by the introduction of the new confidence kernel.
Detailed description
As mentioned above, a common approach in post-processing an estimated disparity map is to use joint-bilateral filters. Filter coefficients of such a bilateral filter [4] are calculated scene-adaptively by using the color information of the original images, and the adaptive filter is then applied to the disparity maps. However, in practice, there are several crucial drawbacks with this approach:
• Smoothing object borders
o In spite of the edge-preserving property of bilateral filters, which use scene-adaptively weighted filter coefficients, discontinuities in the disparity map at object borders might be somewhat smoothed due to the averaging characteristics of conventional bilateral filter kernels.
• Introduction of false disparity values
o When using large filter windows it can happen that pixels inside the filter window, which have the same color as the center pixel but originate from another object at different depth, can corrupt the filter response (i.e., the output pixel is aligned to a wrong disparity value).
• Sensitivity to mismatches
o Usually bilateral filters do not take into account confidence measures from the disparity estimation process. As a consequence, mismatches might remain as wrong disparity values in the filtered disparity map, might affect the filter response in surrounding pixels, or might even be propagated in the filtered disparity map.
• Temporal consistency
o Regular cross-bilateral filters and global optimization approaches only work on a per-frame basis and thus do not consider temporal consistency. Where local correlation methods are used that can in principle provide temporal consistency (e.g. by local temporal recursion), the subsequent application of regular cross-bilateral filters can degrade or even remove the temporal consistency again.
To overcome these drawbacks the invention uses the following new multi-lateral filter structure instead of conventional bilateral filters.
Motion Compensation and Reliably Motion Compensated (RMC) Pixel
To keep temporal consistency in case of local disparity estimation (or to introduce it into global methods), the multi-lateral filter structure is based on spatio-temporal processing using motion compensation. In a straightforward approach, motion compensation is applied to a symmetric cluster of N = 2rt + 1 frames in a temporal interval [t0 - rt ; t0 + rt], where t0 denotes the current frame. However, if necessary from the implementation point of view, it can also be applied to an asymmetric cluster in the temporal interval [t0 - rt ; t0] using current and previous frames only.
According to one possible implementation, a combination of KLT tracking (Kanade-Lucas-Tomasi feature tracking) and simple frame differencing is used to create motion compensated images (MCIs). However, any other method of motion compensation can be used as well. After motion compensation, corresponding pixels are at the same position p in all MCIs as they are in the current reference frame t0. As an example, Figure 1 shows this condition for some images in a symmetric cluster with rt=10, N=21 and an interval [t0 - 10 ; t0 + 10]. Non-RMC pixels are marked with white color in Figure 1.
If the color intensity at a pixel position does not change significantly over time after motion compensation, this pixel is labeled as a reliably motion compensated (RMC) pixel. All other pixels (i.e. non-RMC pixels) are marked with white color in the example from Figure 1. Note that, for labeling RMC pixels, further reliability measures can be used in addition or instead (e.g. an implicit confidence measure of the motion compensation method used, like the one available from KLT tracking, or consistency checks between forward and backward motion estimation).
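The RMC labeling rule can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `label_rmc`, the use of gray values instead of color, and the threshold `max_abs_diff` are hypothetical and are not taken from the description, which does not specify how "significantly" is measured.

```python
def label_rmc(mci_stack, ref_index, max_abs_diff=10):
    """Return a 2-D mask that is True where a pixel is treated as RMC.

    mci_stack    -- list of motion compensated frames, each a 2-D list of gray values
    ref_index    -- index of the reference frame t0 inside mci_stack
    max_abs_diff -- hypothetical threshold for a "significant" intensity change
    """
    ref = mci_stack[ref_index]
    height, width = len(ref), len(ref[0])
    mask = [[True] * width for _ in range(height)]
    for frame in mci_stack:
        for y in range(height):
            for x in range(width):
                # A pixel stays RMC only if every frame agrees with the reference.
                if abs(frame[y][x] - ref[y][x]) > max_abs_diff:
                    mask[y][x] = False
    return mask
```

Pixels for which the mask is False would correspond to the non-RMC pixels drawn in white in Figure 1.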
Adaptive Switching between Weighted Median and Average Filters
Since conventional cross-bilateral filters compute a scene-adaptively weighted average of disparity values in the filter window, some undesired smoothing of the depth maps at object borders might occur. This smoothing of depth edges can be avoided by switching between weighted median and weighted average filters, which have the following complementary properties [6]. On the one hand, weighted median filters preserve depth discontinuities at object borders better than weighted average filters. On the other hand, in regions far from object borders, e.g. in areas of homogeneous color, weighted average filters clearly outperform weighted median filters. Thus, switching between the two filter types has to be designed scene-adaptively in dependence on the local structure of color images and depth maps.
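To achieve such a scene-adaptive performance, the underlying invention uses the following multi-lateral filter structure:
$$D_o(p_0, t_0) = \begin{cases} \operatorname{weighted\_median}_{(p,t) \in n} \{ w(p, t),\; D_{mc}(p, t) \} & \text{if } Mask(p_0, t_0) \text{ selects the median} \\ \operatorname{weighted\_average}_{(p,t) \in n} \{ w(p, t),\; D_{mc}(p, t) \} & \text{otherwise} \end{cases}$$
with
$$w(p, t) = \mathrm{weight}_{dist}(p, p_0, t) \cdot \mathrm{weight}_{conf}(p, t) \cdot \mathrm{weight}_{temp}(p, t) \qquad (1)$$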
In general, the filter structure is based on a 3-dimensional spatio-temporal window n around a center pixel position p0 and the reference frame t0. Note that p and t denote pixels and frames within window n. Furthermore, the filter structure is applied to motion compensated disparity values Dmc(p,t). For this purpose, the same motion compensation as for MCI generation in Section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel" is applied to the disparity maps, and the temporal extent of window n coincides with the interval [t0 - rt ; t0 + rt] or [t0 - rt ; t0] of motion compensation. At the output, filtered disparity values Do(p0,t0) are calculated for the central window position p0 at reference frame t0.
As usual, the weighted averaging filter is an operation where the motion compensated disparity values Dmc(p,t) are multiplied by adaptive weighting factors w(p,t) before calculating the average score. In contrast, for the weighted median the frequencies of the motion compensated disparity values Dmc(p,t) are multiplied by weighting factors w(p,t) before calculating the median score. In both cases, the weighting factors have to be normalized such that their sum over the active window space equals a pre-defined constant value.
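The difference between the two measures of central tendency can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of the embodiments: the function names are hypothetical, the window is flattened into two parallel lists, and the weights are normalized implicitly by dividing by (or comparing against) their sum.

```python
def weighted_average(values, weights):
    """Weighted mean of the disparity values in the window."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero")
    return sum(v * w for v, w in zip(values, weights)) / total


def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half of the total weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero")
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        # Each value counts "weight" times, i.e. its frequency is scaled.
        cumulative += weight
        if cumulative >= total / 2:
            return value
```

For a window that straddles a depth edge, e.g. values `[10, 10, 10, 40, 40]` with weights `[1, 1, 1, 0.5, 0.5]`, the weighted average is 17.5, a disparity that belongs to neither object, whereas the weighted median stays at 10.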
As written in eq. (1), the related weighting factors depend on the three multiplicative kernels weightdist(p,p0,t), weightconf(p,t) and weighttemp(p,t). The meaning of these kernels is explained in the following sections.
The scene-adaptive switching between the filter types is driven by a binary mask Mask(p,t0) that is derived from color and disparity information in the reference frame t0 only. The following rules are used for computing Mask(p,t0):
• The weighted median is used if
o the central pixel p0 is labeled as RMC pixel (see Section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel")
o or a depth discontinuity is detected inside the filter window at reference frame t0 (e.g., the maximal gradient of the initial disparity values Dmc(p,t0) exceeds a predefined threshold Thres_DepthDisc)
o or color values of the original images differ significantly within the filter window at reference frame t0 (e.g., the variance of color samples within the window exceeds a predefined threshold Thres_Var)
• The weighted average is used in all other cases (i.e., homogeneous color regions in reference frame t0 with a sufficiently large distance to object borders)
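The rules listed above can be sketched as a small decision function in Python. The sketch makes several assumptions that are not part of the description: the function name `use_weighted_median` is hypothetical, the gradient is approximated by differences between horizontally and vertically adjacent disparity values, and the color window is reduced to gray values. The thresholds play the role of Thres_DepthDisc and Thres_Var.

```python
def use_weighted_median(is_rmc_center, disparity_window, color_window,
                        thres_depth_disc, thres_var):
    """Return True if the weighted median should be used, False for the average."""
    height = len(disparity_window)
    width = len(disparity_window[0])

    # Maximal absolute difference between adjacent disparity values in the window.
    max_gradient = 0.0
    for y in range(height):
        for x in range(width):
            d = disparity_window[y][x]
            if x + 1 < width:
                max_gradient = max(max_gradient, abs(disparity_window[y][x + 1] - d))
            if y + 1 < height:
                max_gradient = max(max_gradient, abs(disparity_window[y + 1][x] - d))

    # Variance of the (gray value) color samples in the window.
    samples = [c for row in color_window for c in row]
    mean = sum(samples) / len(samples)
    variance = sum((c - mean) ** 2 for c in samples) / len(samples)

    return bool(is_rmc_center
                or max_gradient > thres_depth_disc
                or variance > thres_var)
```

Evaluating this function for every pixel position of the reference frame would yield a binary mask of the kind shown in Figure 4.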
Figure 2 shows a schematic block diagram of a filter structure 20 according to embodiments of the teachings disclosed herein. The filter structure 20 for filtering a disparity map D(p, t0) of a temporal sequence of disparity maps D(p, t) comprises a first filter 24 for filtering a contemplated section 12 (for example: pixel p0, or a group of pixels) of the disparity map D(p, t0) according to a first measure of central tendency; a second filter 26 for filtering the contemplated section 12 (e.g., p0) of the disparity map D(p, t0) according to a second measure of central tendency; and a filter selector 22 for selecting the first filter 24 or the second filter 26 for filtering the contemplated section 12 of the disparity map D(p, t0), the selection being based on at least one local property of the contemplated section 12. Note that the filtering may only affect the center pixel p0, or may also affect other pixels within a filter window (n) in which the pixel p0 is the center pixel. The filter structure 20 is configured to output a filtered disparity map Do(p, t0) (i.e., the filtered disparity values) comprising a filtered contemplated section 92.
For example, the first filter 24 may be a median filter and filter the contemplated section 12 according to a median filtering scheme. The second filter 26 may be an average filter and configured to filter the contemplated section 12 according to an average filtering scheme.
The at least one local property may control a binary mask Mask(p, t0) for the disparity map indicating to the filter selector 22 whether the first filter 24 or the second filter 26 is to be used for filtering the contemplated section. The at least one local property may comprise at least one of
- the contemplated section 12 being labeled as reliably motion compensated (RMC) by a motion compensation unit upstream of the filter structure;
- a detection of a depth discontinuity within a filter window (n) that is used for filtering the contemplated section 12;
- a color inhomogeneity or gray value inhomogeneity within a filter window (n) of a color image or a gray value image corresponding to the filter window (n) of the disparity map D(p, t0) that is used for filtering the contemplated section 12; and
- a variance of color samples exceeding a threshold, the variance being determined within a filter window (n) of a color image or a gray value image corresponding to the filter window (n) of the disparity map D(p, t0).
The first filter 24 may be a weighted first filter and the second filter 26 may be a weighted second filter.
A weighting performed by the weighted first filter 24 or the weighted second filter 26 may be based on at least one of
- a distance measure between the contemplated section 12 (e.g. p0) and a further section of the disparity map to be used for the weighted filtering;
- a confidence value for the contemplated section 12 (e.g. p0) of the disparity map D(p, t0); and
- a temporal consistency between the contemplated section 12 (e.g. p0) of the disparity map D(p, t0) and a corresponding section or matching section in at least one of a preceding disparity map, several preceding disparity maps, a subsequent disparity map, and several subsequent disparity maps.
The distance measure may be determined on the basis of a sum of color differences along a path from the contemplated section 12 (e.g. p0) to the further section. A filter window (n) may be associated to the contemplated section 12 (e.g. p0) of the disparity map D(p, t0), the filter window (n) being a 3-dimensional spatio-temporal window and defining a spatial extension and a temporal extension of filtering actions performed by the first filter 24 and the second filter 26.
The filter structure 20 may further comprise a section iterator for iterating the contemplated section 12 (e.g. p0) of the disparity map D(p, t0) over the disparity map or a part thereof. The contemplated section 12 (e.g. p0) may correspond to a pixel of the disparity map D(p, t0).
The filter selector 22 may comprise an adaptive switching unit for switching between the first filter 24 and the second filter 26.
Figure 3 shows a schematic flow diagram of a method for filtering a disparity map D(p, t0) of a temporal sequence of disparity maps. The method comprises a step 302 of determining a local property of a contemplated section p0 of the disparity map for the purpose of filtering. A first filter or a second filter is then selected at a step 304. The first and second filters are provided for filtering the contemplated section (e.g. pixel p0 or a region surrounding pixel p0). The selection is based on the at least one determined local property of the contemplated section. The method further comprises a step 306 of filtering the contemplated section p0 of the disparity map D(p, t0) using the first filter or the second filter depending on a result of the selection of the first filter or the second filter.
Depending on whether the step 304 of selecting has selected the first filter or the second filter, the contemplated section is filtered using a first measure of central tendency or a second measure of central tendency. For example, the first measure of central tendency may be a median and the second measure of central tendency may be an average.
The method may further comprise: determining a binary mask for the disparity map on the basis of the local property, the binary mask indicating to the filter selector whether the first filter or the second filter is to be used for filtering the contemplated section (e.g. p0). Filtering the contemplated section (e.g. p0) may comprise: weighting disparity values comprised in the contemplated section of the disparity map, for example using weighting factors w(p, t).
The distance measure may be determined on the basis of a sum of color differences along a path from the contemplated section (e.g. p0) to a further section.
The method may further comprise: iterating the contemplated section (e.g. p0) of the disparity map D(p, t0) over the disparity map or a part thereof.
The selection of the first filter or the second filter may comprise a scene-adaptive switching between the first filter and the second filter in dependence on a local structure of color images and depth maps corresponding to the disparity map D(p, t0).
Figure 4 shows the effect of the scene-adaptive switching between weighted median and average filters. The top depicts an original image (left) and a magnified region with a large region of homogeneous color (right). The large region of homogeneous color (yellow in the original color image) has been hatched in Figure 4. The left picture in the middle shows the initial disparity map in this region (boundaries of regions having different disparity values being enhanced for clearer representation). Note that it contains some matching noise although the object in the region under inspection corresponds to a plane in 3D space. The right picture in the middle shows the result after applying only the weighted median filter to the initial disparity values. The depth discontinuity at the object border is preserved due to the median properties, but the smoothing in the homogeneous color regions is still imperfect. In contrast, the picture at the bottom right shows the result for an adaptive switching between the two filter types. Note that the depth discontinuities are still preserved due to the usage of the weighted median in this area, whereas the smoothing in the homogeneous region is now better because the filter structure switches to the weighted average in this region. In addition, the left picture at the bottom shows the binary mask Mask(p,t0) that has been derived from the original color image (top right) and the initial disparity map (middle left) to control the adaptive switching: black indicates "weighted average filter" and white indicates "weighted median filter".
New distance function
In addition, the invention may also use a new kind of distance kernel. The usual distance kernel of conventional bilateral filters is replaced by a kernel weightdist(p,p0,t) that represents the cost of the cheapest path between each pixel p in the filter window and its center pixel p0, at every frame t in the 3-dimensional window.
A path Pi(p,p0,t) is a sequence of adjacent pixels that can be found for each frame t between an arbitrary pixel p in the filter window and its center pixel p0 by using an 8-connectivity operator. The index i indicates that there is usually more than one possible path between these two points. The cost C(Pi) of a particular path Pi is the sum of all absolute color differences along the path. The distance of minimal cost can then be defined as follows:
$$Dist(p, p_0, t) = \min_{P_i \in \{P_i(p, p_0, t)\}} \{ C(P_i(p, p_0, t)) \}$$
with {Pi(p,p0,t)} indicating the set of all possible paths between p and p0 at frame t. Hence, Dist(p,p0,t) describes the cost of the "cheapest" path from p0 to p at frame t.
Assuming that two disconnected objects of almost same color but with different depth have other regions of different colors in between, the above introduction of path costs into the distance kernel inhibits the influence of pixels in the filter window which do not belong to the same object and depth as the center pixel but have almost the same color. A similar distance function has already been used in [5] for controlling the size of adaptive measurement windows in stereo matching.
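The cheapest-path distance can be computed for all pixels of one frame with Dijkstra's algorithm on the 8-connected pixel grid. The following Python sketch is one possible way to do this and is not claimed to be the method used in the embodiments: the function name `cheapest_path_costs` is hypothetical, a single gray channel stands in for color, and the step cost is the absolute gray value difference between adjacent pixels. How the resulting cost is mapped to the kernel weight is not covered here.

```python
import heapq


def cheapest_path_costs(image, center):
    """Return Dist(p, p0) for every pixel p of a 2-D gray image.

    image  -- 2-D list of gray values (the filter window in one frame)
    center -- (row, column) of the center pixel p0
    """
    height, width = len(image), len(image[0])
    cy, cx = center
    dist = [[float("inf")] * width for _ in range(height)]
    dist[cy][cx] = 0
    heap = [(0, cy, cx)]
    while heap:
        cost, y, x = heapq.heappop(heap)
        if cost > dist[y][x]:
            continue  # stale queue entry, a cheaper path was already found
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy or dx) and 0 <= ny < height and 0 <= nx < width:
                    # Path cost grows by the absolute difference along the step.
                    new_cost = cost + abs(image[ny][nx] - image[y][x])
                    if new_cost < dist[ny][nx]:
                        dist[ny][nx] = new_cost
                        heapq.heappush(heap, (new_cost, ny, nx))
    return dist
```

In a one-row window `[[10, 200, 10]]` with the center at the left pixel, the right pixel has the same gray value as the center but can only be reached across the bright pixel, so its path cost is high even though its direct color difference is zero. This is the situation of two similarly colored objects separated by a differently colored region.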
Figure 5 shows a performance comparison between distance/color kernel of conventional cross-bilateral filters and the invented filter method using the new distance function. In other words, Figure 5 gives an example of the improvements that can be achieved with the new distance function. The left picture at the top shows a black-and-white version of the original color image, and the right picture is a magnification of a critical region as well as the corresponding disparity map. The magnified image region shows a part of the standing woman's head in front of the background. Note that the background contains an f-letter that has the same color as the woman's hair.
The left pictures at the bottom show the weights of a conventional bilateral filter (for a window with the center pixel po in the f-letter at the background, see white dot at the intersection of the vertical and horizontal bars of the f-letter, as indicated by the arrow in the magnified image region at top) as well as the resulting filtered disparity map. It can be seen that the filter improves the alignment of the disparity map to the object border, but that it also aligns a wrong disparity (depth) to the f-letter in the background. The reason is that woman's hair and f-letter have almost the same color and that the distance between head and f-letter is not high enough to clearly separate these two objects. Hence, the pixels in the head are labeled with high weights in this case and the disparity value of the head's depth is wrongly aligned to the region of the f-letter.
As shown in the right pictures at the bottom, this misalignment can be avoided by using the new distance function. The costs of any path from the center pixel p0 to all pixels p in the head region are relatively high because of the blue background color between the woman's hair and the f-letter. Thus, the distance between head and f-letter is artificially increased and the high weights in the head region are removed. As a consequence, the correct disparity value referring to the background depth is now aligned to the f-letter.
Confidence kernel
Apart from the new distance function, the underlying invention also introduces a new confidence kernel weightconf(p, t). It takes into account that confidence and reliability measures might be available from the matching process for each disparity value. Hence, they can be used to assign a high weight to reliable matches and, vice versa, low weights to matches with low confidence and poor reliability. In principle, any confidence and reliability measure can be taken in this context.
As an example, the weight of the confidence kernel can use two terms, one referring to the reliability of the disparity maps D and another one evaluating color matches in the original images I:
$$\mathrm{weight}_{conf}(p, t) = \mathrm{conf}_D(D_{mc}(p, t)) \cdot \mathrm{conf}_I(I_{mc}(p, t))$$
A very common confidence measure for the reliability of estimated disparity maps is based on left-right consistency. Assuming that both left-to-right and right-to-left disparity maps are available, a consistency check can be carried out by calculating the following difference diffD(p,t):
$$\mathrm{diff}_D(p, t) = \left| D_{mc,lr}(p, t) + D_{mc,rl}(p + D_{mc,lr}(p, t),\, t) \right| \qquad (5)$$
Here, Dmc,lr(p,t) and Dmc,rl(p,t) denote initial disparity maps which have been estimated from left to right stereo images and, vice versa, from right to left images at frame t, and have been motion compensated afterwards as described in Section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel". Note that p and D are 2-dimensional vectors in eq. (5), containing both a horizontal and a vertical image component, and that, assuming a rectified stereo setup, the vertical component of the disparity maps is usually equal to zero. The disparity related term of the confidence kernel can then be calculated as the reciprocal of the difference from the left-right consistency check:
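$$\mathrm{conf}_D(D_{mc}(p, t)) = \begin{cases} 0 & \text{if } \mathrm{diff}_D(p, t) > Thres_{LeftRight} \\ 1 & \text{if } \mathrm{diff}_D(p, t) = 0 \\ 1 / \mathrm{diff}_D(p, t) & \text{elsewhere} \end{cases}$$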
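The disparity related confidence term can be sketched in Python for a single rectified scanline. The sketch rests on assumptions that are labeled as such: the function names are hypothetical, disparities are integers and purely horizontal, and the sign convention (a left-to-right disparity d at x maps to position x + d, where a consistent right-to-left disparity is -d) is an assumption made for this illustration.

```python
def left_right_diff(disp_lr, disp_rl, x):
    """Left-right consistency difference at column x of one scanline.

    Assumed convention: a consistent match satisfies
    disp_lr[x] + disp_rl[x + disp_lr[x]] == 0.
    """
    target = x + disp_lr[x]
    if not 0 <= target < len(disp_rl):
        return float("inf")  # match points outside the image: treat as unreliable
    return abs(disp_lr[x] + disp_rl[target])


def conf_d(diff_d, thres_left_right):
    """Confidence from the consistency difference, following the three cases."""
    if diff_d > thres_left_right:
        return 0.0
    if diff_d == 0:
        return 1.0
    return 1.0 / diff_d
```

With integer disparities the difference is either 0 or at least 1, so the resulting confidence stays in the range from 0 to 1.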
Similar to eq. (5), one can define a difference diffI(p,t) between color matches:
$$\mathrm{diff}_I(p, t) = \left| I_{mc,l}(p, t) - I_{mc,r}(p + D_{mc,lr}(p, t),\, t) \right|$$
In this context, Imc,l(p,t) and Imc,r(p,t) denote the motion-compensated color images as described in Section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel". A confidence kernel related to color matches can then be defined accordingly:
$$\mathrm{conf}_I(I_{mc}(p, t)) = \begin{cases} 0 & \text{if } \mathrm{diff}_I(p, t) > Thres_{ColorMatch} \\ 1 & \text{if } \mathrm{diff}_I(p, t) = 0 \\ 1 / \mathrm{diff}_I(p, t) & \text{elsewhere} \end{cases}$$
The results in Figure 6 demonstrate the improvements that can be achieved by using the new confidence kernel. In particular, Figure 6 shows from left to right: a black-and-white version of the original color image; (a) a magnified disparity map in a critical region; (b) a filtered disparity map obtained by using a conventional cross-bilateral filter; (c) a confidence kernel in this region; and (d) the improvement achieved by using the additional confidence kernel.
The left picture in Figure 6 again shows a black-and-white version of the initial color image followed by a magnified disparity map estimated in a critical region (picture (a) ). Note that there are a lot of crucial mismatches between the arm of the man in the foreground and the back of the woman behind him. The next picture (b) shows the disparity map after conventional cross-bilateral filtering. The disparity map has clearly been improved but some mismatches remain (see black circle). The third picture (c) shows the confidence map in this region. Non-reliable matches are marked with black color. As these areas also cover the remaining mismatches after conventional filtering, they are removed with the new method using an additional confidence kernel. The improved result is shown in the right picture (d).
Temporal consistency
A kernel weighttemp(p,t) enforcing temporal consistency may also be introduced in the new filter structure from Section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel". This temporal kernel controls the influence of (temporally) adjacent frames on the final filter results and thereby smooths the results in the temporal direction. However, to prevent smoothing over moving object borders, temporal filtering is applied to the motion compensated disparity maps and is restricted to RMC pixels only (see Section "Motion Compensation and Reliably Motion Compensated (RMC) Pixel"):
$$\mathrm{weight}_{temp}(p, t) = \begin{cases} \exp\left(-\left(\tfrac{\ldots}{\ldots}\right)^{2}\right) & \text{if } p \text{ is an RMC pixel} \\ 1 & \text{if } p \text{ is not an RMC pixel and } t = t_0 \\ 0 & \text{if } p \text{ is not an RMC pixel and } t \neq t_0 \end{cases}$$
Note that non-RMC pixels are excluded from temporal filtering. For these pixels the filter process degenerates to 2-dimensional filtering applied only to reference frame t0.
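The case distinction of the temporal kernel can be sketched in Python as follows. The argument of the exponential for RMC pixels is an assumption made for this illustration only: a Gaussian-like falloff in the frame distance (t - t0), scaled by a hypothetical parameter `sigma_t`. The function name `weight_temp` mirrors the kernel name but the signature is likewise hypothetical.

```python
import math


def weight_temp(is_rmc, t, t0, sigma_t=2.0):
    """Temporal weight of frame t relative to reference frame t0.

    sigma_t is a hypothetical scale parameter; the exact exponent used for
    RMC pixels is assumed here, not taken from the description.
    """
    if is_rmc:
        # Assumed form: weight decays with the temporal distance to t0.
        return math.exp(-((t - t0) / sigma_t) ** 2)
    # Non-RMC pixels contribute only in the reference frame itself.
    return 1.0 if t == t0 else 0.0
```

For a non-RMC pixel all frames other than t0 receive weight 0, which reproduces the degeneration to purely 2-dimensional filtering.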
Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus, like for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed. Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may for example be stored on a machine readable carrier. Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
A further embodiment of the inventive methods is, therefore, a data carrier (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein. The data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory.
A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may for example be configured to be transferred via a data communication connection, for example via the Internet.
A further embodiment comprises a processing means, for example a computer, or a programmable logic device, configured to or adapted to perform one of the methods described herein.
A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
In some embodiments, a programmable logic device (for example a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.
The above described embodiments are merely illustrative for the principles of the present invention. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the impending patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.
REFERENCES:
[1] D. Scharstein and R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms", IJCV, vol. 47, no. 1-3, pp. 7-42, 2002.
[2] M. Z. Brown, D. Burschka, and G. D. Hager, "Advances in computational stereo", IEEE Trans. Pattern Analysis and Machine Intelligence, 25(8):993-1008, 2003.
[3] S. M. Seitz, B. Curless, J. Diebel, D. Scharstein, and R. Szeliski, "A comparison and evaluation of multi-view stereo reconstruction algorithms", in Proc. IEEE Conf. Comp. Vision and Pattern Recognition, pages 519-528, 2006.
[4] C. Tomasi and R. Manduchi, "Bilateral Filtering for Gray and Color Images", in Proceedings of the IEEE International Conference on Computer Vision, 1998.
[5] A. Hosni, M. Bleyer, M. Gelautz, and C. Rhemann, "Local stereo matching using geodesic support weights", ICIP 2009.
[6] M. Mueller, F. Zilly, and P. Kauff, "Adaptive cross-trilateral depth map filtering", in 3DTV-Conference: The True Vision - Capture, Transmission and Display of 3D Video (3DTV-CON'10), Jun. 2010, pp. 1-4.
Claims
1. A filter structure (20) for filtering a disparity map (D(p, to)), the filter structure (20) comprising a first filter (24) for filtering a contemplated section (12) of the disparity map (D(p, to)) according to a first measure of central tendency; a second filter (26) for filtering the contemplated section (12) of the disparity map (D(p, to)) according to a second measure of central tendency; and a filter selector (22) for selecting the first filter (24) or the second filter (26) for filtering the contemplated section (12) of the disparity map (D(p, to)), the selection being based on at least one local property of the contemplated section (12).
2. The filter structure (20) according to claim 1, wherein the first filter (24) is a median filter and the first measure of central tendency is a median.
3. The filter structure (20) according to claim 1 or 2, wherein the second filter (26) is an average filter and the second measure of central tendency is an average.
4. The filter structure (20) according to any one of claims 1 to 3, wherein the first filter (24) is a median filter and the first measure of central tendency is a median, and wherein the second filter (26) is an average filter and the second measure of central tendency is an average.
5. The filter structure (20) according to any one of claims 1 to 4, wherein the at least one local property controls a binary mask for the disparity map (D(p, to)) indicating to the filter selector whether the first filter (24) or the second filter (26) is to be used for filtering the contemplated section (12).
6. The filter structure (20) according to any one of claims 1 to 5, wherein the at least one local property comprises at least one of
- the contemplated section being labeled as reliably motion compensated (RMC) by a motion compensation unit upstream of the filter structure;
- a detection of a depth discontinuity within a filter window that is used for filtering the contemplated section (12);
- a color inhomogeneity or gray value inhomogeneity within a filter window of a color image or a gray value image corresponding to the filter window of the disparity map that is used for filtering the contemplated section (12); and
- a variance of color samples exceeding a threshold, the variance being determined within a filter window of a color image or a gray value image corresponding to the filter window of the disparity map (D(p, t0)).
7. The filter structure (20) according to any one of claims 1 to 6, wherein at least one of the first filter (24) and the second filter (26) is a weighted filter.
8. The filter structure (20) according to claim 7, wherein the disparity map (D(p, t0)) is part of a temporal sequence of disparity maps and wherein a weighting performed by the weighted filter or the weighted filters is based on at least one of
- a distance measure between the contemplated section (12) and a further section of the disparity map (D(p, t0)) to be used for the weighted filtering;
- a confidence value for the contemplated section (12) of the disparity map (D(p, t0)); and
- a temporal consistency between the contemplated section (12) of the disparity map (D(p, t0)) and at least one of a preceding disparity map, several preceding disparity maps, a subsequent disparity map, and several subsequent disparity maps.
9. The filter structure (20) according to claim 8, wherein the distance measure is determined on the basis of a sum of color differences along a path from the contemplated section (12) to the further section.
10. The filter structure (20) according to any one of claims 1 to 9, wherein a filter window is associated to the contemplated section (12) of the disparity map (D(p, t0)), the filter window being a 3 -dimensional spatio-temporal window and defining a spatial extension and a temporal extension of filtering actions performed by the first filter (24) and the second filter (26).
11. The filter structure (20) according to any one of claims 1 to 10, further comprising a section iterator for iterating the contemplated section of the disparity map over the disparity map (D(p, t0)) or a part thereof.
12. The filter structure (20) according to any one of claims 1 to 11, wherein the contemplated section (12) corresponds to a pixel of the disparity map (D(p, t0)).
13. The filter structure (20) according to any one of claims 1 to 12, wherein the filter selector (22) comprises an adaptive switching unit for switching between the first filter (24) and the second filter (26).
14. A method for filtering a disparity map (D(p, t0)), the method comprising
- determining (302) a local property of a contemplated section (12) of the disparity map (D(p, t0)) for the purpose of filtering;
- selecting (304) a first filter (24) or a second filter (26) for filtering the contemplated section (12), the selection being based on the at least one determined local property of the contemplated section (12);
- filtering (306) the contemplated section (12) of the disparity map (D(p, t0)) using the first filter (24) or the second filter (26) depending on a result of selecting the first filter or the second filter.
15. The method according to claim 14, wherein the first filter (24) is a median filter and the first measure of central tendency is a median.
16. The method according to claim 14 or 15, wherein the second filter (26) is an average filter and the second measure of central tendency is an average.
17. The method according to any one of claims 14 to 16, wherein the first filter (24) is a median filter and the first measure of central tendency is a median, and wherein the second filter (26) is an average filter and the second measure of central tendency is an average.
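The three method steps of claim 14, combined with the median and average filters of claims 15 to 17, can be sketched in Python as follows. The local property used here (the range of disparity values in the window exceeding a threshold) and the rule that the median is chosen when that property holds are assumptions for illustration. The claims only require that the selection depends on a local property.

```python
from statistics import mean, median


def has_depth_discontinuity(samples, threshold):
    """Hypothetical local property: large spread of disparities in the window."""
    return max(samples) - min(samples) > threshold


def filter_pixel(samples, threshold=8):
    """Determine the local property, select a filter, and apply it."""
    use_median = has_depth_discontinuity(samples, threshold)  # steps 302 and 304
    # Step 306: the first filter (median) or the second filter (average).
    return median(samples) if use_median else mean(samples)


across_edge = filter_pixel([10, 10, 11, 50, 52])  # spread 42, median is used
smooth_area = filter_pixel([10, 11, 12, 13])      # spread 3, average is used
```

In this sketch the median keeps the output on one side of the depth edge, while the average smooths small fluctuations in a homogeneous region.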
18. The method according to any one of claims 14 to 17, further comprising: determining a binary mask for the disparity map (D(p, t0)) on the basis of the local property, the binary mask indicating to the filter selector whether the first filter (24) or the second filter (26) is to be used for filtering the contemplated section (12).
19. The method according to any one of claims 14 to 18, wherein the at least one local property comprises at least one of
- the contemplated section (12) being labeled as reliably motion compensated (RMC) by a motion compensation unit upstream of the filter structure;
- a detection of a depth discontinuity within a filter window that is used for filtering the contemplated section (12);
- a color inhomogeneity or gray value inhomogeneity within a filter window of a color image or a gray value image corresponding to the filter window of the disparity map that is used for filtering the contemplated section (12); and
- a variance of color samples exceeding a threshold, the variance being determined within a filter window of a color image or a gray value image corresponding to the filter window of the disparity map (D(p, t0)).
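As an illustration of the binary mask of claim 18 driven by the variance criterion of claim 19, the Python sketch below marks every pixel whose local gray-value variance exceeds a threshold. The population variance, the square window clipped at the image borders, and the convention that 1 selects the first filter are assumptions; the function name `variance_mask` is hypothetical.

```python
from statistics import pvariance


def variance_mask(gray, radius, threshold):
    """Return a binary mask: 1 where the local gray-value variance exceeds threshold."""
    height, width = len(gray), len(gray[0])
    mask = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Gather the window around (r, c), clipped at the image borders.
            window = [
                gray[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
            ]
            mask[r][c] = 1 if pvariance(window) > threshold else 0
    return mask


# A flat dark region next to a bright column on the right.
gray = [
    [10, 10, 10, 200],
    [10, 10, 10, 200],
    [10, 10, 10, 200],
]
mask = variance_mask(gray, radius=1, threshold=100)
```

Only pixels whose window touches the bright column are flagged, so a filter selector reading this mask would treat the edge region differently from the homogeneous region.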
20. The method according to any one of claims 14 to 19, wherein filtering the contemplated section (12) comprises weighting disparity values comprised in the contemplated section (12) of the disparity map (D(p, t0)).
21. The method according to any one of claims 14 to 20, wherein the disparity map (D(p, t0)) is part of a temporal sequence of disparity maps, and wherein the weighting is based on at least one of
- a distance measure between the contemplated section (12) and a further section of the disparity map (D(p, t0)) to be used for the weighted filtering;
- a confidence value for the contemplated section (12) of the disparity map (D(p, t0)); and
- a temporal consistency between the contemplated section (12) of the disparity map (D(p, t0)) and at least one of a preceding disparity map, several preceding disparity maps, a subsequent disparity map, and several subsequent disparity maps.
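The weighting of claims 20 and 21 can be sketched in Python under some loud assumptions: the three factors are combined multiplicatively, the distance is mapped to a weight with an exponential falloff controlled by a hypothetical parameter `sigma`, and all weights are positive. The claims list only the inputs of the weighting, not this particular formula.

```python
import math


def sample_weight(distance, confidence, temporal_consistency, sigma=10.0):
    """Hypothetical multiplicative weight built from the three listed inputs."""
    return math.exp(-distance / sigma) * confidence * temporal_consistency


def weighted_average(values, weights):
    """Weighted mean of the disparity samples (weights assumed positive)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value


values = [10, 12, 50]
weights = [1.0, 1.0, 0.1]  # the outlier 50 receives a low weight
avg = weighted_average(values, weights)
med = weighted_median(values, weights)
```

A low weight for a distant, unreliable, or temporally inconsistent sample reduces its influence on the average and makes it unlikely to be picked by the weighted median.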
22. The method according to claim 21, wherein the distance measure is determined on the basis of a sum of color differences along a path from the contemplated section (12) to the further section.
23. The method according to any one of claims 14 to 22, wherein a filter window is associated to the contemplated section of the disparity map, the filter window being a 3-dimensional spatio-temporal window and defining a spatial extension and a temporal extension of filtering actions performed by the first filter (24) and the second filter (26).
24. The method according to any one of claims 14 to 23, further comprising: iterating the contemplated section (12) of the disparity map over the disparity map (D(p, t0)) or a part thereof.
25. The method according to any one of claims 14 to 24, wherein the contemplated section (12) corresponds to a pixel of the disparity map (D(p, t0)).
26. The method according to any one of claims 14 to 25, wherein the selection of the first filter (24) or the second filter (26) comprises a scene-adaptive switching between the first filter and the second filter in dependence on a local structure of color images and depth maps corresponding to the disparity map (D(p, t0)).
27. A computer readable digital storage medium having stored thereon a computer program having a program code for performing, when running on a computer, the method according to any one of claims 14 to 26.
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP12801500.5A EP2786580B1 (en) | 2011-11-30 | 2012-11-29 | Spatio-temporal disparity-map smoothing by joint multilateral filtering |
| US14/287,262 US9361677B2 (en) | 2011-11-30 | 2014-05-27 | Spatio-temporal disparity-map smoothing by joint multilateral filtering |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201161564919P | 2011-11-30 | 2011-11-30 | |
| US61/564,919 | 2011-11-30 | | |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/287,262 Continuation US9361677B2 (en) | 2011-11-30 | 2014-05-27 | Spatio-temporal disparity-map smoothing by joint multilateral filtering |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2013079602A1 (en) | 2013-06-06 |
Family
ID=47358124
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2012/073979 Ceased WO2013079602A1 (en) | 2011-11-30 | 2012-11-29 | Spatio-temporal disparity-map smoothing by joint multilateral filtering |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US9361677B2 (en) |
| EP (1) | EP2786580B1 (en) |
| WO (1) | WO2013079602A1 (en) |
Families Citing this family (21)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9098908B2 (en) | 2011-10-21 | 2015-08-04 | Microsoft Technology Licensing, Llc | Generating a depth map |
| CN104205827B (en) * | 2012-03-30 | 2016-03-16 | 富士胶片株式会社 | Image processing apparatus and method and camera head |
| CN104272732B (en) | 2012-05-09 | 2016-06-01 | 富士胶片株式会社 | Image processing apparatus, method and shooting device |
| WO2013173282A1 (en) * | 2012-05-17 | 2013-11-21 | The Regents Of The University Of Califorina | Video disparity estimate space-time refinement method and codec |
| US20140241612A1 (en) * | 2013-02-23 | 2014-08-28 | Microsoft Corporation | Real time stereo matching |
| US20160073094A1 (en) * | 2014-09-05 | 2016-03-10 | Microsoft Corporation | Depth map enhancement |
| CN105517677B (en) * | 2015-05-06 | 2018-10-12 | 北京大学深圳研究生院 | The post-processing approach and device of depth map/disparity map |
| WO2017014693A1 (en) * | 2015-07-21 | 2017-01-26 | Heptagon Micro Optics Pte. Ltd. | Generating a disparity map based on stereo images of a scene |
| US10699476B2 (en) * | 2015-08-06 | 2020-06-30 | Ams Sensors Singapore Pte. Ltd. | Generating a merged, fused three-dimensional point cloud based on captured images of a scene |
| TWI608447B (en) * | 2015-09-25 | 2017-12-11 | 台達電子工業股份有限公司 | Stereo image depth map generation device and method |
| CN105303536A (en) * | 2015-11-26 | 2016-02-03 | 南京工程学院 | Median filtering algorithm based on weighted mean filtering |
| US10839535B2 (en) | 2016-07-19 | 2020-11-17 | Fotonation Limited | Systems and methods for providing depth map information |
| US10462445B2 (en) * | 2016-07-19 | 2019-10-29 | Fotonation Limited | Systems and methods for estimating and refining depth maps |
| KR102455632B1 (en) * | 2017-09-14 | 2022-10-17 | 삼성전자주식회사 | Mehtod and apparatus for stereo matching |
| EP3462408A1 (en) | 2017-09-29 | 2019-04-03 | Thomson Licensing | A method for filtering spurious pixels in a depth-map |
| US11436706B2 (en) * | 2018-03-19 | 2022-09-06 | Sony Corporation | Image processing apparatus and image processing method for improving quality of images by removing weather elements |
| CN110400344B (en) * | 2019-07-11 | 2021-06-18 | Oppo广东移动通信有限公司 | Depth map processing method and device |
| KR102455833B1 (en) * | 2020-02-12 | 2022-10-19 | 한국전자통신연구원 | Method and apparatus for enhancing disparity map |
| US11308349B1 (en) * | 2021-10-15 | 2022-04-19 | King Abdulaziz University | Method to modify adaptive filter weights in a decentralized wireless sensor network |
| GB2624542B (en) * | 2023-10-24 | 2024-11-13 | Opteran Tech Limited | Stabilised 360 degree depth imaging system without image rectification |
| CN120543444B (en) * | 2025-07-28 | 2025-09-26 | 英纳瑞医疗科技(上海)有限公司 | Artificial intelligence-based bone mineral density image processing method, system, equipment and medium |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2009061305A1 (en) * | 2007-11-09 | 2009-05-14 | Thomson Licensing | System and method for depth map extraction using region-based filtering |
| US20100271511A1 (en) * | 2009-04-24 | 2010-10-28 | Canon Kabushiki Kaisha | Processing multi-view digital images |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110032341A1 (en) * | 2009-08-04 | 2011-02-10 | Ignatov Artem Konstantinovich | Method and system to transform stereo content |
| WO2011033673A1 (en) * | 2009-09-18 | 2011-03-24 | 株式会社 東芝 | Image processing apparatus |
- 2012-11-29: EP application EP12801500.5A (patent EP2786580B1), status: active
- 2012-11-29: WO application PCT/EP2012/073979 (publication WO2013079602A1), status: ceased
- 2014-05-27: US application US14/287,262 (patent US9361677B2), status: active
Non-Patent Citations (6)
| Title |
|---|
| D. SCHARSTEIN; R. SZELISKI: "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms", IJCV, vol. 47, no. 1-3, 2002, pages 7 - 42 |
| HOSNI; M. BLEYER; M. GELAUTZ; C. RHEMANN: "Local stereo matching using geodesic support weights", ICIP, 2009 |
| M. MUELLER; F. ZILLY; P. KAUFF: "Adaptive cross-trilateral depth map filtering", 3DTV-CONFERENCE: THE TRUE VISION - CAPTURE, TRANSMISSION AND DISPLAY OF 3D VIDEO (3DTVCON'10), June 2010 (2010-06-01), pages 1 - 4, XP031706363 |
| M.Z. BROWN; D. BURSCHKA; G. D. HAGER: "Advances in computational stereo", IEEE TRANS. PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 25, no. 8, 2003, pages 993 - 1008, XP011099378, DOI: 10.1109/TPAMI.2003.1217603 |
| S.M. SEITZ; B. CURLESS; J. DIEBEL; D. SCHARSTEIN; R. SZELISKI: "A comparison and evaluation of multi-view stereo reconstruction algorithms", PROC. IEEE CONF. COMP. VISION AND PATTERN RECOGNITION, 2006, pages 519 - 528, XP010922864, DOI: 10.1109/CVPR.2006.19 |
| TOMASI; R. MANDUCHI: "Bilateral Filtering for Gray and Color Images", PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION, 1998 |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP2860975A1 (en) | 2013-10-09 | 2015-04-15 | Thomson Licensing | Method for processing at least one disparity map, corresponding electronic device and computer program product |
| CN103916656A (en) * | 2014-03-13 | 2014-07-09 | 华中科技大学 | Image rendering method by utilizing depth image |
| CN103916656B (en) * | 2014-03-13 | 2016-01-20 | 华中科技大学 | One utilizes depth map to carry out image drawing method |
| US9756312B2 (en) | 2014-05-01 | 2017-09-05 | Ecole polytechnique fédérale de Lausanne (EPFL) | Hardware-oriented dynamically adaptive disparity estimation algorithm and its real-time hardware |
| US10395343B2 (en) | 2014-11-20 | 2019-08-27 | Commissariat A L'energie Atomique Et Aux Energies Alternatives | Method and device for the real-time adaptive filtering of noisy depth or disparity images |
| EP3979203A4 (en) * | 2019-07-11 | 2022-08-03 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | DEPTH MAP PROCESSING METHOD AND APPARATUS, ELECTRONIC DEVICE AND READABLE STORAGE MEDIA |
Also Published As
| Publication number | Publication date |
|---|---|
| US20140270485A1 (en) | 2014-09-18 |
| EP2786580A1 (en) | 2014-10-08 |
| US9361677B2 (en) | 2016-06-07 |
| EP2786580B1 (en) | 2015-12-16 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| EP2786580B1 (en) | Spatio-temporal disparity-map smoothing by joint multilateral filtering | |
| US11562498B2 (en) | Systems and methods for hybrid depth regularization | |
| US9111389B2 (en) | Image generation apparatus and image generation method | |
| US9171372B2 (en) | Depth estimation based on global motion | |
| US20130127844A1 (en) | Filling disocclusions in a virtual view | |
| Ansar et al. | Enhanced real-time stereo using bilateral filtering | |
| WO2014072926A1 (en) | Generation of a depth map for an image | |
| EP2880634A1 (en) | Video communication with three dimensional perception | |
| CN110268712A (en) | Method and apparatus for processing image property maps | |
| Ceulemans et al. | Robust multiview synthesis for wide-baseline camera arrays | |
| Tallón et al. | Upsampling and denoising of depth maps via joint-segmentation | |
| WO2013173282A1 (en) | Video disparity estimate space-time refinement method and codec | |
| Furihata et al. | Novel view synthesis with residual error feedback for FTV | |
| US20150350669A1 (en) | Method and apparatus for improving estimation of disparity in a stereo image pair using a hybrid recursive matching processing | |
| CN107680083B (en) | Parallax determination method and parallax determination device | |
| Köppel et al. | Temporally consistent adaptive depth map preprocessing for view synthesis | |
| Robert et al. | Disparity-compensated view synthesis for s3D content correction | |
| Lin et al. | Interactive disparity map post-processing | |
| Chen et al. | Improving Graph Cuts algorithm to transform sequence of stereo image to depth map | |
| Song et al. | Depth map boundary filter for enhanced view synthesis in 3D video | |
| Wei et al. | Video synthesis from stereo videos with iterative depth refinement | |
| Lin et al. | Spatio-temporally consistent multi-view video synthesis for autostereoscopic displays | |
| Zuo et al. | A refined weighted mode filtering approach for depth video enhancement | |
| Zhu et al. | Temporally consistent disparity estimation using PCA dual-cross-bilateral grid | |
| Zhao et al. | Virtual view synthesis and artifact reduction techniques |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 12801500; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2012801500; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |