RU2423018C2

RU2423018C2 - Method and system to convert stereo content

Info

Publication number: RU2423018C2
Application number: RU2009129700/09A
Authority: RU
Inventors: Артем Константинович Игнатов (RU); Оксана Васильевна Джосан (RU)
Original assignee: Корпорация "САМСУНГ ЭЛЕКТРОНИКС Ко., Лтд."
Priority date: 2009-08-04
Filing date: 2009-08-04
Publication date: 2011-06-27
Also published as: RU2009129700A; KR20110014067A

Abstract

FIELD: information technologies.

SUBSTANCE: an initial disparity/depth map is calculated for a stereo image from a 3D video; the depth map is smoothed; depth perception parameters are changed in accordance with an estimate of eye fatigue; a new stereo image is generated in accordance with the depth perception parameters. The stereo content conversion system for reducing eye fatigue when viewing 3D video comprises a depth map calculation and smoothing unit, a depth control unit and a visualization unit, wherein the first output of the depth map calculation and smoothing unit is connected to the first input of the visualization unit, and the second output of the depth map calculation and smoothing unit is connected to the input of the depth control unit, whose output is connected to the second input of the visualization unit.

EFFECT: a device and a method for converting stereo content that reduce eye fatigue when viewing 3D video and provide reliable control of depth perception during display on a 3D television receiver.

22 cl, 22 dwg, 1 tbl

Description

The claimed invention relates to methods and systems for processing stereo images and video and, in particular, to methods and devices for converting stereo content in order to reduce eye fatigue when watching three-dimensional video.

Three-dimensional (volumetric) television is expected to replace current television technology, showing the viewer not only sequences of two-dimensional images but also images of three-dimensional scenes. One functional requirement for a three-dimensional television device is the ability to change depth according to the viewer's preferences for more comfortable viewing. Controlling image depth requires solving the problem of synthesizing new views (images). New virtual views are synthesized using information from the disparity/depth map, which is calculated from the incoming stereo image pairs. Correct disparity calculation is a very difficult yet essential task, because the quality of a synthesized stereo pair with modified depth depends largely on the quality of the depth map. Thus, a stereo matching method is needed that generates a raw (initial) disparity/depth map and then post-processes it, so that the result can be used to synthesize virtual views during the display of three-dimensional content.

Disparity calculation, or stereo matching, reduces to finding the pixel-by-pixel (point-to-point) correspondence between stereo views. The input is two or more images from multiple cameras, and the output is a correspondence map (disparity map) that gives, for each point of one image, the matching point in the other image. The resulting disparity is large for nearby objects and small for distant objects. Thus, the disparity map can be regarded as the inverse of the scene depth.
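The inverse relation between disparity and depth can be made concrete with the textbook pinhole model for a rectified camera pair, Z = f * B / d. The text does not state this formula. It is a standard relation added here for illustration, and the function and parameter names below are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified stereo pair (assumed pinhole model).

    disparity_px: horizontal shift of a point between the two views, in pixels
    focal_px:     focal length expressed in pixels (assumed known)
    baseline_m:   distance between the camera centres, in metres (assumed known)
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; no finite depth.
        return None
    return focal_px * baseline_m / disparity_px


near = depth_from_disparity(100, 1000, 0.5)  # large disparity, close object
far = depth_from_disparity(10, 1000, 0.5)    # small disparity, distant object
```

In this sketch a tenfold drop in disparity gives a tenfold increase in depth, which is the inversion described in the paragraph above.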

Stereo matching algorithms can be divided into local algorithms (operating on the neighborhood of the current pixel) and global algorithms (operating on the whole image) [4]. Local algorithms assume that the computed disparity function is smooth within the support window. Their results are usually not very accurate but are acceptable for real-time use. Global algorithms, on the other hand, use an explicitly defined smoothness function and then solve an optimization problem. This usually requires complex computational methods such as dynamic programming or graph cuts.
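As a minimal sketch of the local family, the following block matcher picks, for every pixel of the left view, the disparity whose support window has the smallest sum of absolute differences (SAD). The SAD cost, the square window, the clamped borders and the convention that left pixel x matches right pixel x - d are assumptions made for illustration. The text does not prescribe a particular local algorithm.

```python
def sad_disparity(left, right, max_disp, radius=1):
    """Window-based local stereo matching on rectified grayscale images.

    left, right: images as lists of rows of integer intensities
    max_disp:    largest disparity to test
    radius:      half-size of the square support window
    """
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best_cost, best_d = None, 0
            for d in range(0, min(max_disp, x) + 1):
                cost = 0
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        # Clamp coordinates at the image border.
                        yy = min(max(y + dy, 0), h - 1)
                        xl = min(max(x + dx, 0), w - 1)
                        xr = min(max(x + dx - d, 0), w - 1)
                        cost += abs(left[yy][xl] - right[yy][xr])
                # Keep the first (smallest) disparity with the lowest cost.
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp
```

The four nested loops show why the cost grows with both the window size and the number of tested disparity levels, which is the trade-off the prior art discussion returns to repeatedly.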

A recently developed disparity estimation method, described in US Patent Application No. 2006/0120594 [1], consists of two main steps: local matching and global optimization. Local matching identifies pixels with high matching confidence; pixels with low confidence are set to zero during this step. Global optimization is then used to estimate the final values of the depth map. It is performed by a two-pass dynamic programming method, in which the first pass runs in the horizontal direction and the second in the vertical direction. The disadvantage of this method is its high computational cost. As many disparity levels must be computed as the closest object in the scene exhibits. For cameras with high resolution and a long focal length, the number of disparities to be computed can range from 100 to 300. In addition, local matching applies a set of directional filters to determine the matching weight. Finally, the global optimization process is computationally demanding even when used on its own.

Another solution, concerning sequential (progressive) stereo matching, is found in US Patent No. 7106899 [2]. The invention uses an iterative technique that incorporates the disparity gradient limit principle and a least-commitment strategy. The idea is to arrive at a highly reliable disparity map by starting from unambiguously matched pixels and proceeding through multiple iterations. The method is sequential in the sense that the number of pixels to be analyzed decreases with each iteration. Unambiguous pixel matches are found using a new correlation technique and a correlation score associated with each pixel match. The disadvantage of this method is its high computational cost.

Published international patent application WO 2008/041167 [3] proposes a disparity calculation method based on iterative filtering of an initial disparity map that is a noise image, that is, an image whose pixels are set randomly between the minimum and maximum values. At each iteration, the current disparity estimate is refined by filtering guided by the reference stereo images. The advantage of the method is that it reaches an acceptable depth map quality relatively quickly, with object boundaries well delineated on the basis of object color. Its disadvantage is that depth values inside objects are assigned incorrectly according to color: brighter objects or parts of objects in the color image appear closer, while darker objects receive larger depth values. In addition, good results require a filter with a large kernel, which demands significant computational resources.

Russian patent application No. 2008144840 [5] develops the idea of [3] further. To reduce the tendency to assign depth values solely according to image colors, the raw disparity estimate was computed in a standard way based on window-based stereo matching. A filtering scheme based on color information from the stereo pair was then applied. To reduce the number of incorrect depth values, the gradient limit principle [6] was applied. To lower the computational load, adaptation of the filter radius to the iteration number was investigated. For a large number of iterations, for example more than six, the algorithm runs approximately 40% faster with improved quality.

Russian patent application No. 2008140111 [7] describes a method for quickly improving a raw disparity map obtained by pixel-by-pixel matching within support windows. The idea is to detect "bad pixels" in the raw disparity map, that is, pixels carrying erroneous depth data. Such pixels are usually located in occluded and low-texture areas of the image. Once these areas are identified, the method propagates correct depth values into them by filtering guided by image color. The method uses only one color image and can give excellent results provided that bad pixels make up no more than 30% of the raw disparity map.

Patent application RU 2009110511 [8] describes a system for capturing three-dimensional images and displaying them on an autostereoscopic display. The main components of the system are an image capture unit, which receives images from stereo or multi-camera rigs; a disparity estimation unit, which computes the disparity between adjacent views; and a view synthesis unit, which generates several views in accordance with the characteristics of the three-dimensional display of the three-dimensional playback unit. The depth estimation and view synthesis methods are chosen so that they can be implemented on high-performance computing devices such as GPUs or FPGAs.

Recently published US Patent Application No. 2009/0129667 [9] describes a device and method for estimating a depth map, generating an intermediate image and encoding multi-view video. Disparity is estimated in two stages: a raw disparity map is computed first, and then belief propagation (BP) is applied to improve the depth map. BP methods give the best results in disparity estimation. Their disadvantages are very high computational complexity and high memory requirements, so they are usually implemented as personal computer software that processes multi-view data offline. Intermediate images are generated using the depth-image-based rendering (DIBR) technique known from [10]. Multi-view images are encoded with an MPEG-like video compression method based on the block-wise discrete cosine transform (DCT) followed by entropy coding.

The technical solution closest to the claimed invention is described in published US Patent Application No. 2008/0240549 [11], which proposes a method for controlling the depth of the three-dimensional effect when displaying stereo or multi-view images. The method estimates the disparity of the corresponding stereo images, adjusts the depth of the three-dimensional effect on the basis of the disparity histogram, and rearranges the stereo images. The degree of depth control is determined as the result of convolving the disparity histogram with a characteristic function. Two types of characteristic function are described: one is intended for scenes containing only a background, the other for video with a pronounced foreground object and a background. The degree to which the stereo views are rearranged depends on the sum of the convolution of the characteristic function with the disparity histogram. This patent application is chosen as the prototype of the claimed invention.

The disadvantages of the prototype are as follows. The prototype method gives no explanation of how the input video is classified into the "background" mode or the "foreground + background" mode. Nor is it clear how the disparity is estimated. Computing the disparity map is a very difficult task, so the method for computing it must be clearly defined so that it can be implemented in hardware (HW) while providing quality no lower than that of known solutions. Moreover, a hardware implementation must use as little memory as possible to reduce the cost of the device. Thus, there is a need for a depth control method based on high-quality depth generation with low memory and computation requirements.

The problem solved by the claimed invention is to create a device and a method for converting stereo content to reduce eye fatigue when watching three-dimensional video, in other words, to provide reliable control of depth perception during display on a three-dimensional television receiver. This is achieved by generating a high-quality stereo image on the basis of the calculated disparity/depth map.

The technical result is achieved by developing a method and a device for depth control when displaying stereo content in three-dimensional television, wherein the method comprises the following operations:

- calculating an initial disparity/depth map for a stereo image from a three-dimensional video;

- smoothing the depth map;

- changing the depth perception parameters in accordance with an estimate of eye fatigue;

- generating a new stereo image in accordance with the depth perception parameters.

For the practical implementation of the claimed method, a stereo content conversion system for reducing eye fatigue when watching three-dimensional video has been developed. It comprises a depth map calculation and smoothing unit, a depth control unit and a visualization unit. The first output of the depth map calculation and smoothing unit is connected to the first input of the visualization unit; the second output of the depth map calculation and smoothing unit is connected to the input of the depth control unit, whose output is connected to the second input of the visualization unit.
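The connections between the three units can be sketched as plain function composition. This is only an illustration of the data flow described above. The function name, the callables passed in and the choice to model both outputs of the depth unit as the same smoothed map are assumptions, and the passage does not say how the source images reach the visualization unit.

```python
def convert_stereo(stereo_pair, depth_unit, depth_control_unit, visualization_unit):
    """Route data between the three units as the connections are described.

    All three unit arguments are hypothetical callables supplied by the caller.
    """
    # Depth map calculation and smoothing unit. Both of its outputs are
    # modelled here as the same smoothed depth map (assumption).
    depth_map = depth_unit(stereo_pair)
    # Second output -> depth control unit, which yields depth perception parameters.
    params = depth_control_unit(depth_map)
    # First output -> first input, control output -> second input of visualization.
    return visualization_unit(depth_map, params)


# Usage with stand-in units: scale every depth value by a control parameter.
result = convert_stereo(
    ("left", "right"),
    lambda pair: [1, 2, 3],
    lambda depth: {"scale": 0.5},
    lambda depth, params: [v * params["scale"] for v in depth],
)
```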

Calculating depth from stereo content is a difficult task, especially for surfaces with homogeneous (untextured) areas, areas with non-uniform depth, occluded areas and areas with a repeating pattern, all of which lead to ambiguous solutions. This means that only a few of the assigned depth values are reliable and unambiguous. Some depth values, for example in occluded (i.e. hidden) areas, cannot be computed by matching at all, since those areas are visible in only one image. At the same time, synthesizing a high-quality virtual view requires 1) a dense depth map; 2) accurate depth boundaries that coincide exactly with object boundaries; 3) evened-out depth values within an object.

It is therefore necessary to identify and correct ambiguous depth values so that virtual view synthesis produces no visible artifacts and approximates the real depth as closely as possible. Prior art solutions have mainly relied on various optimization strategies, such as dynamic programming, graph cuts and segmentation-based stereo matching. However, such solutions demand very high computational resources and do not produce a smooth depth map suitable for artifact-free view synthesis.

The proposed method focuses on quickly improving the initial depth map within a local window rather than on using global optimization to compute disparity. The initial depth map can be obtained by standard local stereo matching methods. A depth map of this kind is usually very noisy, especially in weakly textured areas and in occluded areas. The main idea of the method is to use a weighted averaging filter to smooth and improve the initial depth map on the basis of the reference color images and the reliable depth pixels. The underlying assumption is that pixels of similar color within some neighborhood also have similar depth values. Reliable depth values can therefore be assigned to uncertain pixels on the basis of color similarity and proximity in the reference color images. In addition, such filtering refines the pixels with reliable depth and produces a dense, smooth depth map that meets the depth map requirements listed above.
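A color-guided weighted average of this kind might look like the sketch below. The Gaussian weights, the single-channel guide image, the parameter names and the default values are assumptions chosen for illustration. The text specifies only that the weights depend on color similarity and proximity and that only reliable depth pixels contribute.

```python
import math


def smooth_depth(depth, color, reliable, radius=2, sigma_color=10.0, sigma_space=2.0):
    """Weighted average of reliable depth values, guided by a reference image.

    depth:    depth map as a list of rows
    color:    reference image intensities, same size (single channel assumed)
    reliable: boolean mask, True where the depth value is trusted
    """
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    if not reliable[yy][xx]:
                        continue  # unreliable pixels never contribute
                    dc = color[y][x] - color[yy][xx]
                    ds2 = (y - yy) ** 2 + (x - xx) ** 2
                    # Similar colour and small distance give a large weight.
                    weight = math.exp(-dc * dc / (2 * sigma_color ** 2))
                    weight *= math.exp(-ds2 / (2 * sigma_space ** 2))
                    num += weight * depth[yy][xx]
                    den += weight
            if den > 0:
                out[y][x] = num / den
    return out
```

Because an unreliable pixel receives a value built only from reliable neighbors of similar color, a wrong depth inside an object is pulled toward the depth of that object, while the strong color difference across an object boundary keeps the two sides from mixing.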

The proposed method uses several techniques to decide whether the current pixel is anomalous (unreliable). Unreliable pixels are marked with dedicated mask values so that they are excluded from filtering. Various techniques can be used to assess pixel reliability; the proposed method applies a left-right cross-check of depth values. In other words, if the difference between the left and right depth values at corresponding points is below a threshold, the depth values are considered reliable. Otherwise they are marked as anomalous and excluded from smoothing. However, handling anomalous pixels successfully requires filters with a large kernel when objects are occluded or the depth map is noisy. The proposed method relies on a recursive implementation to reduce the filter kernel size. A recursive implementation means that the filtering result is written back to the source buffer. This also makes the algorithm converge faster, in fewer iterations.
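The effect of writing the result back to the source buffer can be illustrated with a minimal sketch. It assumes a plain box average over reliable neighbours instead of the color-weighted averaging used by the method, and the function name, the `radius` parameter and the choice to mark a filled pixel as reliable are all hypothetical.

```python
def smooth_in_place(depth, reliable, radius=1):
    """One pass of a masked box filter that writes results back into `depth`.

    depth    -- list of rows of numbers, modified in place (the source buffer)
    reliable -- same shape, True where the depth value is considered reliable;
                also updated in place (assumption: a filtered pixel becomes reliable)
    """
    height, width = len(depth), len(depth[0])
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    if reliable[ny][nx]:  # unreliable pixels do not contribute
                        total += depth[ny][nx]
                        count += 1
            if count:
                # Writing into the source buffer lets already-filtered
                # neighbours influence pixels processed later in the same pass,
                # which is why a small kernel can still fill larger holes.
                depth[y][x] = total / count
                reliable[y][x] = True
    return depth
```

Because each output is visible to the pixels that follow it, a small window propagates reliable values further per pass than a filter that reads from an untouched copy.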

The proposed method also examines how anomalous pixels can be detected in the depth map itself. Histogram analysis is applied to remove noise from the raw depth map. Depth map noise appears as ripples at the low and high ends of the histogram (Fig. 7, view 7.3). The histogram is clipped at both ends to discard the anomalous pixels. Histogram clipping is suitable for hardware implementation because it uses local histograms built from the data held in the line memories, so there is no need to process the whole image.

To remove noise from the initial depth map effectively in weakly textured areas, the depth smoothing algorithm processes such areas with stronger smoothing filter settings. For this purpose a binary mask of textured and weakly textured areas of the corresponding color image is built using a dedicated gradient filter. The filter is suitable for hardware implementation because it computes four types of gradients in a local window.

The method described in this invention produces a high-quality depth map that supports view synthesis with the specified depth perception parameters. Fig. 13 shows some results of the proposed method. These results were obtained with seven line memories and three iterations of the algorithm. The examples show that the proposed method can improve even a very noisy depth map in just a few iterations while working in a local window.

The essence of the claimed invention is explained below with reference to the drawings.

Fig. 1. Block diagram of a system for converting stereo content in order to reduce eye fatigue when watching three-dimensional video, according to the invention.

Fig. 2. Flowchart of the steps of the method for converting stereo content in order to reduce eye fatigue when watching three-dimensional video, according to the invention.

Fig. 3. Block diagram of the depth map calculation and smoothing unit, according to the invention.

Fig. 4. Flowchart of the steps of the depth map smoothing method based on recursive filtering, according to the invention.

Fig. 5. Examples of different stereo frame orientations.

Fig. 6. Illustration of a depth histogram in which the darkest five percent and the brightest five percent of the histogram are shaded black.

Fig. 7. Example of depth histogram clipping: view 7.1 - color image, view 7.2 - corresponding depth map, view 7.3 - depth histogram with clipping thresholds for the darkest six percent and the brightest three percent of pixels.

Fig. 8. Flowchart of the depth cross-check method.

Fig. 9. Example of the depth cross-check: view 9.1 - left image with depth, view 9.2 - right image with depth, view 9.3 - left depth with anomalous pixels marked in black.

Fig. 10. Example of binary segmentation of an image into strongly textured and weakly textured areas: view 10.1 - color image, view 10.2 - corresponding depth map, view 10.3 - binary segmentation mask (strongly textured areas are shown in black, weakly textured areas in white).

Fig. 11. Flowchart of the depth filtering method.

Fig. 12. Illustration of the depth filtering principle: view 12.1 - reference color image, view 12.2 - matching color image, view 12.3 - reference depth map.

Fig. 13. Depth smoothing results according to the invention: view 13.1, top to bottom - initial depth map, corresponding color image, depth map smoothed by the claimed method; view 13.2, top to bottom - initial depth map, corresponding color image, depth map smoothed by the claimed method.

A preferred embodiment of the claimed invention is described in detail below with reference to the attached drawings. The scope of protection of the claimed invention is not limited to the preferred embodiment, however, and covers other embodiments as well. The preferred embodiment given in the description is only an example, presented to disclose the essence of the invention and to help those skilled in the art understand it fully.

Fig. 1 is a block diagram of the system for converting stereo content in order to reduce eye fatigue when watching three-dimensional video according to the claimed invention. The system shown in Fig. 1 includes the following interconnected units: a depth map calculation and smoothing unit 102, a depth control unit 103, and a visualization unit 104. The depth map calculation and smoothing unit 102 calculates the depth map from the stereo image 101. The depth map is needed so that the visualization unit 104 can generate a new stereo image 105 in accordance with the depth perception parameters set by the depth control unit 103. The depth map calculation and smoothing unit 102 is described in more detail in the following paragraphs.

Fig. 2 shows the process (method) of converting stereo content to reduce eye fatigue when watching three-dimensional video according to the claimed invention. Consider the method step by step. The first step (Step 201) is calculation of the initial depth map, performed with standard local stereo matching methods known from the prior art. Once the raw depth map has been calculated, the depth control method proceeds to depth map smoothing (Step 202). The purpose of this step is to remove anomalous pixels from the raw depth map and make it suitable for view rendering in accordance with the depth perception parameters. The depth map smoothing method is discussed in detail in the next subsection. The next step of the depth control method adjusts the depth perception of the three-dimensional television content being watched (Step 203). This is done by changing the position of the images for the left and right eye. In the described embodiment, depth perception is controlled by a parameter D, which varies from 0 to 1 and corresponds to the position of the right view. The value 1 corresponds to the configuration of the input stereo view, while the value 0 describes the monocular case, in which the images for the left eye and the right eye coincide in space. Acceptable settings of this parameter lie in the range from 0.1 to 1. In the next step (Step 204) of the depth control method, a new view for the right eye is formed on the basis of the value of D. The new right-eye view can be synthesized by interpolation based on the disparity map [12], since the depth map computed in Step 203 gives the pixel correspondence between the source images for the left and right eye. The original left-eye image together with the new right-eye image forms a modified stereo image that has reduced parallax compared with the original stereo image. The generated stereo image with reduced parallax reduces eye fatigue during prolonged viewing of three-dimensional television.
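The role of the parameter D in Step 204 can be sketched with a simple forward warp of the left image. This is a minimal illustration, not the interpolation of [12]: it assumes non-negative integer horizontal disparities defined on the left image, a right-view position of `x - d` for a left pixel at `x`, and naive hole filling; the function and parameter names are hypothetical.

```python
def synthesize_right_view(left, disparity, d_param):
    """Build a right-eye view at relative position d_param (0 = left view, 1 = input right view).

    left      -- list of rows of pixel values (left-eye image)
    disparity -- same shape, non-negative integer disparity of each left pixel
    d_param   -- depth perception parameter D in [0, 1]
    """
    if not 0.0 <= d_param <= 1.0:
        raise ValueError("D must lie in [0, 1]")
    height, width = len(left), len(left[0])
    out = [[None] * width for _ in range(height)]
    best = [[-1] * width for _ in range(height)]  # disparity of the pixel written so far
    for y in range(height):
        for x in range(width):
            d = disparity[y][x]
            tx = x - int(round(d_param * d))  # scaled shift (assumed sign convention)
            # When two pixels land on the same target, keep the nearer one
            # (larger disparity), so foreground occludes background.
            if 0 <= tx < width and d > best[y][tx]:
                out[y][tx] = left[y][x]
                best[y][tx] = d
        # Naive hole filling (assumption): repeat the last written pixel in the row.
        last = None
        for x in range(width):
            if out[y][x] is None:
                out[y][x] = last if last is not None else left[y][x]
            else:
                last = out[y][x]
    return out
```

With D = 0 every pixel stays in place and the output equals the left image, which matches the monocular case described above; smaller values of D give proportionally smaller shifts and therefore smaller parallax.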

Fig. 3 is a block diagram of the system for smoothing a depth map on the basis of recursive filtering according to the claimed invention. The system shown in Fig. 3 includes the following interconnected units: a preprocessing unit 320, an initial depth map calculation unit 330, a depth map smoothing unit 340, and a temporal filtering unit 350. The input to the smoothing system is a stereo image 310. The stereo image can be supplied as a single image or assembled from video frames received from a stereo camera. When several cameras are used, a pair of images from selected cameras can form the stereo image. After the necessary calculations, the depth map smoothing system generates a dense depth map 307 for the selected views.

The preprocessing unit 320 prepares the stereo image 310 for efficient processing in the initial depth map calculation unit 330 and the depth map smoothing unit 340. The preprocessing unit 320 includes a stereo image preprocessing unit 321 and a reference image segmentation unit 322. The stereo image preprocessing unit 321 has two main functions. First, it extracts from the source stereo image the separate images corresponding to each view. The images are divided into reference and matching images. The reference image 360 is the image of the stereo pair for which the depth map will be smoothed, and the matching image 370 is the other image of the stereo pair. Accordingly, the reference depth map is the depth map of the reference image, while the matching depth map is the depth map of the matching image.

The input video stream can be encoded in various formats. The most common formats use a left-right layout, a top-bottom layout, a checkerboard format, or a left-right layout with the frames separated in time. Examples of the left-right layout (501) and the top-bottom layout (502) are shown in Fig. 5. For better depth map calculation, the source color images should be processed with a spatial noise-suppressing filter. A Gaussian filter can be used for this purpose, although any other filter is also applicable. This functionality is implemented in unit 321. The reference image segmentation unit 322 generates a reference binary mask 302. This binary mask corresponds to the segmentation of the image into strongly textured and weakly textured areas. Mask pixels are set to one if the area is considered weakly textured, and to zero if the area is considered strongly textured. The degree of texture is determined with a gradient filter in a local window.
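The mask can be sketched as follows, using the rule given for Step 404 of the method: gradients are sums of absolute differences of neighbouring pixels in four directions, and a pixel is weakly textured when all four gradients are below a threshold. The sketch assumes a single-channel (grayscale) image and a square window of hypothetical size; the function name and both parameters are illustrative, not taken from the patent.

```python
def texture_mask(gray, threshold, radius=1):
    """Return a binary mask: 1 for weakly textured pixels, 0 for strongly textured ones.

    gray      -- list of rows of intensity values (assumption: grayscale input)
    threshold -- gradient threshold below which a direction counts as flat
    radius    -- half-size of the local window (assumption: square window)
    """
    height, width = len(gray), len(gray[0])
    # Horizontal, vertical and the two diagonal directions as (dy, dx) steps.
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            weak = True
            for dy, dx in directions:
                grad = 0
                for wy in range(y - radius, y + radius + 1):
                    for wx in range(x - radius, x + radius + 1):
                        ny, nx = wy + dy, wx + dx
                        inside = 0 <= wy < height and 0 <= wx < width
                        if inside and 0 <= ny < height and 0 <= nx < width:
                            # Sum of absolute differences of neighbouring pixels.
                            grad += abs(gray[wy][wx] - gray[ny][nx])
                if grad >= threshold:
                    weak = False  # one strong gradient is enough to call it textured
                    break
            mask[y][x] = 1 if weak else 0
    return mask
```

Only additions, absolute differences and comparisons over a small window are needed, which is in line with the hardware-friendliness claimed for the filter.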

The initial depth map calculation unit 330 computes an approximate depth map using standard local matching methods [4]. This is implemented in the reference depth map calculation unit 331 and the matching depth map calculation unit 332. The other function of the initial depth map calculation unit is the detection of anomalous pixels in the approximate depth map. The depth map histogram is clipped in the reference depth map histogram analysis unit, while the cross-check of the depth map is performed by the checking unit 334. The output of unit 330 is the reference and matching depth maps with marked anomalous pixels 305.

The depth map smoothing unit 340 improves the depth map by recursively filtering the raw depth maps 305 (matching and reference). The depth map is filtered in the depth filtering unit 342. The required number of iterations is set by the iteration control unit 341. During each iteration, unit 341 evaluates the convergence criteria of the filtering process. The preferred embodiment of the claimed invention describes two variants of the convergence criterion. The first variant computes the residual image between successive disparity map estimates; the sum of the residual pixels must not exceed the convergence threshold T_dec1. The convergence criterion can also be formulated in terms of the number of depth map filtering iterations: if the number of iterations exceeds the convergence threshold T_dec2, the filtering process is stopped.
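The two criteria can be sketched as a small control loop around an arbitrary filtering step. The sketch combines both criteria in one loop, which is an assumption (the description presents them as alternative variants), and `filter_step` stands for any hypothetical function that takes a depth map and returns the filtered one.

```python
def smooth_until_converged(depth, filter_step, t_dec1, t_dec2):
    """Repeat filter_step until the residual is small or the iteration budget is spent.

    depth       -- list of rows of depth values
    filter_step -- callable taking and returning a depth map (hypothetical)
    t_dec1      -- threshold on the sum of absolute residuals between iterations
    t_dec2      -- maximum number of iterations
    Returns the filtered depth map and the number of iterations performed.
    """
    iterations = 0
    while iterations < t_dec2:
        # Copy first: filter_step may write into the source buffer.
        previous = [row[:] for row in depth]
        depth = filter_step(depth)
        iterations += 1
        residual = sum(
            abs(new - old)
            for new_row, old_row in zip(depth, previous)
            for new, old in zip(new_row, old_row)
        )
        if residual <= t_dec1:
            break  # first criterion: the map has stopped changing
    return depth, iterations
```

The copy of the previous map is what the residual is measured against, so the loop works whether the filtering step is in-place or not.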

The post-processing unit 343 performs the final refinement of the calculated depth maps. In the preferred embodiment of the claimed invention, the post-processing unit 343 consists of a median filter unit. Median filtering is known from the prior art, so this unit is not described here. Other image-enhancement filters can also be used in unit 343 without changing its purpose. The result is the smoothed depth maps 306 (reference and matching).

The temporal filtering unit 350 filters the depth map over time. It consists of a frame buffer 351, which stores a number of depth frames together with the corresponding color images, and a temporal filter unit 352, which performs inter-frame filtering of the depth map using information from the corresponding color images.

Fig. 4 illustrates in detail the claimed depth map smoothing method based on recursive filtering. Consider the method step by step. The first step (Step 401) is preprocessing of the color images: they are filtered with a Gaussian filter in a small window, for example 5×5 pixels, which suppresses noise in the color images. This has a substantial effect on the quality of depth map smoothing, because smoothing uses weighted averages of neighboring pixels and the weights are computed from the color images. The next step is analysis and clipping of the histogram of the reference depth map (Step 402). The histogram is clipped to suppress depth map noise. A raw depth map can contain a significant number of anomalous pixels. Noise can arise from incorrect matching in occluded areas and in weakly textured areas. The claimed method uses two thresholds: threshold B at the lower end of the histogram and threshold T at the upper end. These thresholds are computed automatically from given percentages α and β of anomalous pixels, where α is the ratio of the image pixels that lie below the lower histogram cut to all image pixels, and β is the ratio of the image pixels that lie above the upper histogram cut to all image pixels. The thresholds B and T are calculated as follows:
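The definitions of α, β, H(c), M, N_x and N_y suggest thresholds taken from the cumulative histogram. One formulation consistent with those definitions, offered as an assumption rather than as the exact expression of the original, is:

$$
B = \min\Big\{\, b \;:\; \sum_{c=0}^{b} H(c) \,\ge\, \frac{\alpha}{100}\, N_x N_y \Big\}, \qquad
T = \max\Big\{\, t \;:\; \sum_{c=t}^{M} H(c) \,\ge\, \frac{\beta}{100}\, N_x N_y \Big\}
$$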

where H(c) is the histogram value;

M is the maximum pixel level, equal to 255 for a single-byte representation;

N_x is the image width;

N_y is the image height.

An example of thresholds corresponding to α = β = 5% of the image pixels is shown in Fig. 6. In this case B is 48 and T is 224. Another example of depth map histogram clipping is shown in Fig. 7. The histogram in Fig. 7 is built from all the image data. For hardware implementation, however, a local histogram can be computed from the data currently held in the line memories.
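Threshold selection from the histogram can be sketched as below. The sketch assumes integer depth levels in the range 0..M, takes B as the first level at which the cumulative count from the bottom reaches α percent and T as the first level from the top at which it reaches β percent, and treats pixels exactly at B or T as reliable; these boundary choices and both function names are hypothetical.

```python
def clipping_thresholds(depth, alpha, beta, max_level=255):
    """Return (B, T) so that roughly alpha % of pixels lie below B and beta % above T."""
    hist = [0] * (max_level + 1)
    total = 0
    for row in depth:
        for value in row:
            hist[value] += 1
            total += 1
    # Lower threshold: accumulate from the darkest level upwards.
    cumulative, b = 0, 0
    for level in range(max_level + 1):
        cumulative += hist[level]
        if cumulative * 100 >= alpha * total:  # integer arithmetic, no rounding issues
            b = level
            break
    # Upper threshold: accumulate from the brightest level downwards.
    cumulative, t = 0, max_level
    for level in range(max_level, -1, -1):
        cumulative += hist[level]
        if cumulative * 100 >= beta * total:
            t = level
            break
    return b, t


def reliability_mask(depth, b, t):
    """True where the depth value survives clipping (assumption: B and T inclusive)."""
    return [[b <= value <= t for value in row] for row in depth]
```

For a hardware-style variant, `depth` would hold only the rows currently in the line memories, so the thresholds would follow the local histogram instead of the whole image.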

The next step of the depth smoothing method checks the consistency of the depth map (Step 403). The procedure identifies "consistent" pixels, that is, pixels whose depth has been computed correctly. The proposed depth map smoothing method relies on cross-checking to detect anomalous pixels.

According to Fig. 8, the procedure is carried out as follows:

1. Вычисляют вектор опорной карты диспарантности (reference disparity vector - RDV) на основе значений опорной карты глубины;1. The reference disparity vector (RDV) vector is calculated based on the values of the depth reference map;

2. Извлекают значение совмещаемой карты глубины, отображенной через RDV;2. Retrieve the value of the combined depth map displayed through the RDV;

3. Вычисляют вектор совмещаемой карты диспарантности (matching disparity vector - MDV) на основе значений совмещаемой карты глубины;3. The matching disparity vector (MDV) vector is calculated based on the values of the combined depth map;

4. Вычисляют разность карт диспарантности (disparity difference - DD) абсолютных величин RDV и MDV;4. The difference of disparity difference (DD) maps of the absolute values of RDV and MDV are calculated;

5. Если DD превышает пороговое значение, то пиксель опорной карты глубины помечают как аномальный.5. If DD exceeds the threshold value, then the pixel of the depth reference map is marked as abnormal.
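
The five steps can be sketched for a single image row as shown below. Several points are assumptions made only for illustration: the hypothetical `depth_to_disparity` helper treats depth levels directly as disparities in pixels, the reference pixel at column x is assumed to map to column x - RDV in the matching map, pixels that map outside the row are treated as abnormal, and abnormal pixels are marked with zero.

```python
def depth_to_disparity(depth):
    # Hypothetical conversion: depth levels are treated as disparities in
    # pixels. The real mapping is defined elsewhere in the description.
    return depth


def cross_check_row(ref_depth, match_depth, threshold=2):
    """Return a copy of ref_depth in which inconsistent pixels are set to 0."""
    width = len(ref_depth)
    checked = list(ref_depth)
    for x in range(width):
        rdv = depth_to_disparity(ref_depth[x])       # step 1: reference vector
        xm = x - rdv                                 # assumed shift direction
        if not 0 <= xm < width:
            checked[x] = 0                           # assumption: out of range is abnormal
            continue
        mdv = depth_to_disparity(match_depth[xm])    # steps 2-3: matching vector
        dd = abs(abs(rdv) - abs(mdv))                # step 4: disparity difference
        if dd > threshold:                           # step 5: mark as abnormal
            checked[x] = 0
    return checked
```

A full implementation would run this over every row of the reference map and repeat it with the roles of the two maps swapped.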

An example of a depth map with "noisy" pixels marked by the depth cross-check procedure is shown in FIG. 9, view 9.3. To obtain the result in FIG. 9, view 9.3, the cross-check threshold for the disparity map was set to two, and the "noisy" pixels were marked with zero.

The next step of the depth smoothing method is a binary segmentation of the reference color image into areas with pronounced texture and areas with weakly pronounced texture (Step 404). For this purpose, gradients are calculated in four directions: horizontal, vertical and the two diagonals. Each gradient is calculated as the sum of absolute differences of neighboring pixels in the corresponding direction. If all gradient values are below a predefined threshold, the pixel is treated as weakly textured; otherwise it is treated as having pronounced texture. This can be formulated as follows:

$$BS(x, y) = \begin{cases} 255, & \text{if } gradients(x, y) < GradTh \\ 0, & \text{otherwise} \end{cases}$$

where BS is the binary segmentation mask for the pixel with coordinates (x, y), gradients(x, y) stands for the gradients in the four directions, and GradTh is the predefined threshold. The value 255 corresponds to an image pixel with weakly pronounced texture, while the value 0 corresponds to a pixel with pronounced texture.
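
A minimal sketch of this segmentation is given below. It assumes a single-channel image stored as a list of rows, takes each directional gradient as the sum of the absolute differences to the two neighbors along that direction, and clamps coordinates at the image border. The function name `binary_segmentation` and the border handling are illustrative choices, not taken from the description.

```python
def binary_segmentation(image, grad_th):
    """Return a mask with 255 for weak texture and 0 for pronounced texture."""
    height, width = len(image), len(image[0])

    def px(x, y):
        # Clamp coordinates so that border pixels reuse their nearest neighbor.
        return image[min(max(y, 0), height - 1)][min(max(x, 0), width - 1)]

    # Offsets for the horizontal, vertical and two diagonal directions.
    directions = [(1, 0), (0, 1), (1, 1), (1, -1)]
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            centre = px(x, y)
            gradients = [
                abs(centre - px(x + dx, y + dy)) + abs(centre - px(x - dx, y - dy))
                for dx, dy in directions
            ]
            # Weak texture only if every directional gradient is below the threshold.
            row.append(255 if all(g < grad_th for g in gradients) else 0)
        mask.append(row)
    return mask
```

With a vertical step edge in the image, the two columns adjacent to the edge are classified as textured and the flat columns as weakly textured.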

FIG. 10 illustrates an example of a color image with its binary segmentation mask. FIG. 10 shows that the segmentation method successfully separates the image into weakly textured and strongly textured areas. At the same time, the method detects object boundaries, for example between the ray and the seabed (marked with dark ovals). This is an important property, because object boundaries must be processed carefully in the depth map to avoid artifacts.

After the binary segmentation of the left color image into areas with weakly pronounced texture and areas with pronounced texture, the smoothing method enters the main filtering loop (Steps 405-408). At the start of processing the iteration index is set to zero. It is then incremented after each smoothing iteration. When the index becomes equal to the number of iterations, the filtering loop terminates.

Within each iteration, the next step of the depth smoothing method determines the pixel type from the binary segmentation mask (Step 406). If the pixel is classified as having pronounced texture, the depth map smoothing filter is applied with default settings (Step 408). Otherwise the pixel is classified as weakly textured, and the depth map smoothing filter is applied with settings for stronger smoothing, which give better noise suppression (Step 407).
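
The control flow of Steps 405-408 can be sketched as below. The per-pixel filter is passed in as a hypothetical callable `smooth_pixel(depth, x, y, settings)`, and `default_settings` and `strong_settings` are placeholders for whatever parameters the filter uses; neither name comes from the description. Writing each result straight back into `depth` is an assumption of this sketch.

```python
def run_filter_loop(depth, mask, num_iterations, smooth_pixel,
                    default_settings, strong_settings):
    """Run the iterative smoothing loop and return the updated depth map."""
    iteration = 0                                    # Step 405: index starts at zero
    while iteration < num_iterations:
        for y in range(len(depth)):
            for x in range(len(depth[0])):
                # Step 406: choose the settings from the binary segmentation mask.
                weak_texture = mask[y][x] == 255
                settings = strong_settings if weak_texture else default_settings
                # Steps 407/408: filter in place, so later pixels see the update.
                depth[y][x] = smooth_pixel(depth, x, y, settings)
        iteration += 1                               # loop ends at num_iterations
    return depth
```

The in-place update makes the loop recursive in the filtering sense, since each result is immediately available to the pixels processed after it.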

The operation of the depth map smoothing filter is shown in FIG. 11, which also explains how it can be implemented efficiently in hardware. The algorithm works with memory buffers that hold local image patches rather than the whole image. The table lists the memory buffers used in the description of the algorithm.

Table. Memory buffers

| Memory buffer index | Description of stored data | Buffer size |
|---|---|---|
| 1 | Local area of the reference color image | Kernel size * Number of lines * Number of color channels |
| 2 | Local area of the reference depth map | Kernel size * Number of lines |
| 3 | Pixels of the matching color image addressed by the reference disparity vector | Kernel size * Number of lines * Number of color channels |

Consider a step-by-step implementation of the filtering method for depth map smoothing. The algorithm takes as input a stereo pair of color images (left and right) and a raw depth map computed for one of the color images. Following the classification given above, the image of the stereo pair for which the depth map is smoothed is called the reference color image (RCI), while the other image is called the matching color image (MCI). Accordingly, the smoothed depth map is called the reference depth map (RD). Here the left raw depth map is treated as the reference depth map; the right raw depth map is processed in the same way. FIG. 11 shows one smoothing iteration. If more than one iteration is used, the whole depth map image is processed, the result is written to the RD memory, and the same memory buffer with the updated data is then used as the input.

According to FIG. 11, the first step of the algorithm (Step 1101) copies a pixel region from the reference color image (RCI) into memory 1 for further processing. The window height must equal the number of available lines. In the next step (Step 1102) the algorithm copies pixels from the reference depth map (RD) into memory 2. The following operation checks whether a pixel of the raw depth map is abnormal (Step 1103). The thresholds B and T obtained from the histogram analysis are used for this purpose.

The depth map range check is expressed as follows:

$$B < d(x + x_1, y + y_1) < T \qquad (1)$$

where d(x+x1, y+y1) is the pixel of the raw depth map with coordinates (x+x1, y+y1);

(x, y) are the image coordinates of the current depth map pixel being filtered, while x1, y1 are the indices of the reference depth map pixels stored in memory 2.

If inequality (1) does not hold, the corresponding depth map pixel d(x+x1, y+y1) is excluded from the filtering of pixel d(x, y) (Step 1104), and the next pixel from memory 2 is checked for abnormality; this is repeated until all pixels in memory 2 have been checked. If all pixels turn out to be abnormal, the current depth map pixel is left unchanged. A raw depth map usually contains a large number of erroneous pixels. To filter such areas effectively with a small window, a recursive filter is used, that is, the filtering result for the current pixel is written back into the source depth map. This allows correct depth values to propagate into erroneous areas.

The next step of the algorithm computes disparity values from the depth map pixels stored in memory 2 (Step 1105). Once the disparity vectors have been computed from the depth values, they are used as coordinates of color pixels in the matching color image (MCI). The MCI pixels addressed by the disparity are then copied into memory 3 (Step 1106).

According to FIG. 12, the main idea of the claimed depth map smoother is to refine the raw reference depth map by weighted averaging of the depth map pixels located within the filter window (1201). The filter weights are calculated from information taken from the color images. In FIG. 12 the current pixel being filtered is marked in black (1202, 1203, 1204). Its spatial coordinates are the same in all images (12.1: RCI, 12.2: MCI, 12.3: RD). To calculate a weight, the smoothing filter has to calculate two color distances. The procedure for calculating these distances is described below.

The first color distance, between the current color pixel Ic (1202) and the reference pixel Ir (1205), is calculated in Step 1107 of the depth map smoothing algorithm. Both pixels are stored in memory 1. The metric used is the Euclidean distance, calculated as follows:

$$C(I_c, I_r) = \sqrt{\sum_{k} \left(I_c^{(k)} - I_r^{(k)}\right)^2} \qquad (2)$$

where k runs over the color channels: the squared differences of the color channels are summed, and the square root of the sum is taken. The arrow in FIG. 12 shows the pixels between which the first color distance is calculated.
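
The Euclidean color distance described here can be written as a short function. The name `color_distance` and the representation of a pixel as a tuple of channel values are assumptions for illustration.

```python
import math


def color_distance(pixel_a, pixel_b):
    """Euclidean distance between two pixels given as tuples of channel values."""
    # Sum the squared per-channel differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pixel_a, pixel_b)))
```

The same function serves for both color distances; only the pair of pixels passed in changes.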

The next step of the algorithm calculates the second color distance, between the reference pixel Ir (1205) and the target pixel It (1206) (Step 1108). The target pixel is the pixel of the matching image addressed by the disparity vector of pixel Ir. Since the disparity is one-dimensional, that is, horizontal, Ir and It lie on lines with the same index (1207) (FIG. 12).

Equation (2) can be used to calculate this color distance. The arrow in FIG. 12 shows the pixels between which the second color distance is calculated.

After the two color distances have been calculated, the weight of the reference depth map pixel is calculated as follows (Step 1109)

where C() denotes the function used to compare pixel colors,

e is Euler's number, approximately 2.718,

σ_r is the depth map smoothing parameter for the reference pixel in the reference image,

σ_t is the depth map smoothing parameter for the target pixel in the matching image,

(x_r, y_r) are the coordinates of the reference pixel,

(x_t, y_t) are the coordinates of the target pixel. For a one-dimensional disparity map, y_t = y_r.
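
Because the weight formula itself is not reproduced in this text, the following is only a plausible sketch built from the listed symbols. It assumes an exponential falloff in which each color distance is divided by its own smoothing parameter; the function name `pixel_weight` and this exact functional form are assumptions, not the claimed formula.

```python
import math


def pixel_weight(dist_ref, dist_target, sigma_r, sigma_t):
    """Weight of a reference depth pixel from its two color distances.

    Assumed form: exp(-(dist_ref / sigma_r + dist_target / sigma_t)).
    dist_ref is the distance between the current and reference pixels,
    dist_target the distance between the reference and target pixels.
    """
    return math.exp(-(dist_ref / sigma_r + dist_target / sigma_t))
```

Under this assumption, larger values of sigma_r or sigma_t flatten the weights and give stronger smoothing, which matches the role of the stronger settings used for weakly textured areas.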

Once the weights have been calculated for every pixel of the reference depth map, the algorithm proceeds to the weighted averaging (Step 1110). The weighted average is calculated as follows:

$$d_{out}(x_c, y_c) = \frac{\sum_{p}\sum_{s} w_r(x_r, y_r)\, d_{in}(x_r, y_r)}{\sum_{p}\sum_{s} w_r(x_r, y_r)}$$

where d_out(x_c, y_c) is the depth map smoothing result for the current pixel with coordinates (x_c, y_c),

d_in(x_r, y_r) is the raw depth map value for the reference pixel with coordinates (x_r = x_c + p, y_r = y_c + s),

w_r is the weight of the reference depth map pixel,

the index p runs over the filter window in the X direction,

the index s runs over the filter window in the Y direction,

and the normalizing factor in the denominator is the sum of the weights over the window.

Finally, the filtering result d_out(x_c, y_c) is written to the RD memory (Step 1111).
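
The weighted averaging, combined with the range check against B and T, can be sketched as follows. The square window given by `radius`, the strict comparison with the thresholds, and the precomputed `weights` array are assumptions made to keep the example self-contained.

```python
def smooth_depth_pixel(depth, weights, xc, yc, radius, bottom, top):
    """Weighted average of in-range depth values in a window around (xc, yc).

    weights[y][x] holds the weight w_r of the reference pixel at (x, y).
    """
    height, width = len(depth), len(depth[0])
    weighted_sum = 0.0
    norm = 0.0
    for s in range(-radius, radius + 1):              # offset along Y
        for p in range(-radius, radius + 1):          # offset along X
            xr, yr = xc + p, yc + s
            if not (0 <= xr < width and 0 <= yr < height):
                continue                              # window extends past the image
            d = depth[yr][xr]
            if not bottom < d < top:
                continue                              # abnormal pixel: excluded
            weighted_sum += weights[yr][xr] * d
            norm += weights[yr][xr]
    if norm == 0.0:
        return depth[yc][xc]                          # nothing usable: keep the pixel
    return weighted_sum / norm
```

Writing the returned value back into `depth` before moving to the next pixel gives the recursive behavior described above, in which valid depth values spread into erroneous regions.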

After the specified number of depth map smoothing iterations, the depth map smoothing algorithm proceeds to the post-processing of the reference depth map (Step 909). The claimed method uses a median filter for this purpose, which effectively removes impulse noise from the disparity map. Once the reference depth map has been post-processed, it is written to the RD memory (Step 1110).
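
A median post-filter of this kind might look like the sketch below. The 3x3 default window, the truncation of the window at image borders, and the use of the lower median are assumptions; the description does not specify them.

```python
from statistics import median_low


def median_filter(depth, radius=1):
    """Return a median-filtered copy of a depth map stored as a list of rows."""
    height, width = len(depth), len(depth[0])
    result = [row[:] for row in depth]
    for y in range(height):
        for x in range(width):
            # Collect the window values, truncating the window at the borders.
            window = [
                depth[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # median_low always returns a value that is present in the window.
            result[y][x] = median_low(window)
    return result
```

A single outlier surrounded by consistent values is replaced by the surrounding value, which is the impulse-noise behavior the text refers to.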

To eliminate flicker while three-dimensional video is being watched, a temporal filter in the form of a moving average can be applied to the depth map. The filter uses several smoothed depth maps stored in frame buffer 351 and outputs the filtered depth map frame for the current time stamp.
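
The temporal moving average can be sketched with a bounded queue standing in for the frame buffer. The class name `TemporalDepthFilter`, the default of three frames, and the equal weighting of all buffered frames are assumptions for illustration.

```python
from collections import deque


class TemporalDepthFilter:
    """Moving average over the most recent smoothed depth maps."""

    def __init__(self, num_frames=3):
        # The bounded deque plays the role of the frame buffer: once it is
        # full, appending a new frame discards the oldest one.
        self.frames = deque(maxlen=num_frames)

    def push(self, depth):
        """Add the newest depth map and return the averaged frame."""
        self.frames.append(depth)
        count = len(self.frames)
        height, width = len(depth), len(depth[0])
        return [
            [sum(frame[y][x] for frame in self.frames) / count for x in range(width)]
            for y in range(height)
        ]
```

Calling `push` once per video frame yields the filtered depth map for the current time stamp; until the buffer fills, the average is taken over the frames seen so far.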

FIG. 13 shows some results of applying the claimed method. The results were obtained using seven line memories and three iterations of the algorithm. These examples show that the proposed method can improve even a very noisy depth map in just a few iterations while working in a local window.

The main field of industrial application is the content processing unit of three-dimensional television devices. One of the main current tasks in three-dimensional television is to address viewer complaints about eye fatigue when watching television programs. The main cause of this problem is that people differ individually in how they perceive stereoscopic images and in what they prefer. Sex, age and race may influence stereoscopic preferences, since each individual's visual system has unique characteristics. Unsuitable content in transmitted stereo sequences leads to eye fatigue. This is a complex phenomenon that depends on many parameters, such as large parallax values, crosstalk, conflicts between depth cues, and so on.

The depth control function for reducing eye fatigue can be implemented in two different scenarios. The first is manual adjustment, in which the user has a control and can switch the parameters according to personal preference so that viewing is more comfortable for the eyes. The second scenario is the use of a function that reduces eye fatigue by controlling depth and thus makes watching three-dimensional television more comfortable. The depth enhancement function is applied after the depth map has been calculated, to pre-process the depth parameters before the depth map is modified or new frames are displayed.

The claimed invention can also be applied directly in stereo cameras to produce a high-quality and reliable disparity/depth map. In addition, the claimed invention can be used in multi-camera systems or in dedicated devices in which two separate video streams form a stereoscopic stream and stereo matching is required.

References

[1] US patent application No. 2006/0120594, J. C. Kim et al., "Apparatus and Method for Determining Stereo Disparity based on Two-path Dynamic Programming and GGCP".

[2] US patent No. 7106899, Z. Zhang, Y. Shan, "System and Method for Progressive Stereo Matching of Digital Images".

[3] WO patent application No. 2008/041167, F. Boughorbel, "Method and Filter for Recovery of Disparities in a Video Stream".

[4] D. Scharstein, R. Szeliski, "A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms", IJCV, vol. 47(1), pp. 7-42, 2002.

[5] RU patent application No. 2008144840, A. Ignatov, V. Bucha, "Method and Apparatus for Disparity Estimation and Filtering from Stereo Content", November 14, 2008.

[6] S. P. Pollard et al., "PMF: a stereo correspondence algorithm using a disparity gradient constraint", Perception, vol. 14, pp. 449-470, 1985.

[7] RU patent application No. 2008140111, V. Bucha, A. Ignatov, "Method of a disparity map refinement and apparatus for implementing same method", October 10, 2008.

[8] RU patent application No. 2009110511, A. Ignatov, V. Bucha, M. Rychagov, "System and method for the three-dimensional video acquisition and reproduction", March 24, 2009.

[9] US patent application No. 2009/0129667, Y. S. Ho et al., "Device and Method for Estimating Depth Map, and Method for Generating Intermediate Image and Method for Encoding Multi-view Video Using the Same".

[10] L. Zhang et al., "Stereoscopic Image Generation Based on Depth Images for 3D TV", IEEE Trans. on Broadcasting, vol. 51, pp. 191-199, 2005.

[11] US patent application No. 2008/0240549, J. P. Koo et al., "Method and Apparatus for Controlling Dynamic Depth of Stereo-view or Multi-view Sequence Images".

[12] US patent application No. 2007/0047040, T. H. Ha, "Apparatus and Method for Controlling Depth of Three-dimensional Image".

Claims

1. A method of converting stereo content to reduce eye fatigue when watching three-dimensional video, comprising the following operations:
- calculating an initial disparity/depth map for a stereo image from the three-dimensional video, with the source color images pre-processed by a spatial filter configured to remove noise;
- smoothing the depth map by performing the following operations:
- analyzing and cropping the histogram of the reference depth map;
- checking the consistency of the depth map;
- forming a binary mask of the reference color image that separates areas with pronounced texture from areas with weakly pronounced texture;
- smoothing the reference and matching depth maps by successive depth map filtering iterations;
- filtering the reference depth map in accordance with the binary mask of the reference image in areas with pronounced texture and areas with weakly pronounced texture;
- post-processing the reference and matching depth maps;
- temporally filtering the reference and matching depth maps;
- changing the depth perception parameters in accordance with an estimate of eye fatigue, wherein the depth perception parameter is represented by a parameter D that varies from 0 to 1 and corresponds to the position of the right-eye view, the value 1 corresponding to the configuration of the original stereo image and the value 0 describing the monocular case in which the left-eye and right-eye images coincide in space, the appropriate settings of this parameter lying in the range from 0.1 to 1;
- generating a new stereo image in accordance with the depth perception parameters.

2. The method according to claim 1, characterized in that the depth perception parameters are changed in accordance with the wishes of the user.

3. The method according to claim 1, characterized in that the view for the right eye is synthesized using interpolation based on a disparity map, while the position of the view for the right eye is described by parameter D.

4. The method according to claim 3, characterized in that the synthesized view for the right eye is used together with the original image for the left eye to form a modified stereo image having reduced parallax compared to the image of the original stereo view.

5. The method according to claim 1, characterized in that the histogram of the depth map is cut off by threshold values B and T, which are calculated as follows:

,
where H(s) is the histogram value; M is the maximum pixel level, which is 255 for a single-byte representation; N_x is the width of the area; N_y is the height of the area; α is the ratio of the number of image pixels lying below the bottom cut-off of the histogram to the total number of image pixels; β is the ratio of the number of image pixels lying above the top cut-off of the histogram to the total number of image pixels.

6. The method according to claim 1, characterized in that the consistency check of the depth map is performed using a cross-check of the depth map.
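
A cross-check of this kind is commonly implemented as a left-right consistency test. The sketch below is an illustration under assumptions not stated in the claim: a reference pixel at column x corresponds to column x - d in the matching view, and two disparities agree when they differ by at most `tol` pixels. The function name `cross_check` and the parameter `tol` are hypothetical.

```python
import numpy as np

def cross_check(d_ref, d_match, tol=1.0):
    """Return a boolean mask that is True where the two depth maps agree."""
    d_ref = np.asarray(d_ref, dtype=float)
    d_match = np.asarray(d_match, dtype=float)
    h, w = d_ref.shape
    xs = np.arange(w)
    consistent = np.zeros((h, w), dtype=bool)
    for y in range(h):
        # Assumed convention: reference column x maps to x - d in the matching view.
        x_m = xs - np.rint(d_ref[y]).astype(int)
        inside = (x_m >= 0) & (x_m < w)
        diff = np.abs(d_ref[y, inside] - d_match[y, x_m[inside]])
        consistent[y, inside] = diff <= tol
    return consistent
```

Pixels that map outside the matching view are marked inconsistent in this sketch; a real system might handle them differently.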

7. The method according to claim 1, characterized in that the binary mask of the reference color image is defined as

[formula not reproduced in the source text],
where BS is the binary segmentation mask for the pixel with coordinates (x, y); the value 255 corresponds to a pixel in a weakly textured region and the value 0 to a pixel in a region with pronounced texture; gradients(x, y) is a function that evaluates the horizontal, vertical, and diagonal gradients, each gradient being calculated as the sum of absolute differences of neighboring pixels in the corresponding direction; the gradient values must lie within the threshold GradTh for the region to be recognized as weakly textured, and otherwise the region is recognized as having pronounced texture.
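
The mask construction can be sketched as follows. The exact formula is not available in this text, so the sketch assumes that the gradients in all four directions are added into a single value per pixel, that a 3×3 neighborhood is used, and that "within GradTh" means strictly less than the threshold. The function name `texture_mask` is hypothetical.

```python
import numpy as np

def texture_mask(gray, grad_th):
    """Return 255 for weakly textured pixels and 0 for pronounced texture."""
    g = np.pad(gray.astype(np.int32), 1, mode="edge")
    c = g[1:-1, 1:-1]  # the original image, seen through the padding
    grad = (
        np.abs(c - g[1:-1, :-2]) + np.abs(c - g[1:-1, 2:])    # horizontal
        + np.abs(c - g[:-2, 1:-1]) + np.abs(c - g[2:, 1:-1])  # vertical
        + np.abs(c - g[:-2, :-2]) + np.abs(c - g[2:, 2:])     # main diagonal
        + np.abs(c - g[:-2, 2:]) + np.abs(c - g[2:, :-2])     # anti-diagonal
    )
    # Assumed comparison: strictly below the threshold counts as weak texture.
    return np.where(grad < grad_th, 255, 0).astype(np.uint8)
```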

8. The method according to claim 1, characterized in that the filtering of the disparity map at the k-th iteration is expressed as

[formula not reproduced in the source text],
where d_k(x_c, y_c) denotes the depth map at the k-th iteration for the current pixel with coordinates (x_c, y_c); d_{k-1}(x_r, y_r) denotes the depth map at the (k-1)-th iteration for the reference pixel with coordinates (x_r = x_{c+p}, y_r = y_{c+s}); w_r(x_r, y_r) denotes the weight of the reference pixel; the index p varies from [lower bound not reproduced] to [upper bound not reproduced] in the X direction; the index s varies from [lower bound not reproduced] to [upper bound not reproduced] in the Y direction; and the normalization factor is calculated as

[formula not reproduced in the source text].
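
One filtering iteration of this kind can be sketched as a normalized weighted average over a window around the current pixel. This is an assumed reading of the definitions above, not the claimed formula: the window half-sizes `half_x` and `half_y`, the callback `weight_fn`, and the rule that the normalization factor is the sum of the weights are all assumptions made for illustration.

```python
import numpy as np

def filter_iteration(d_prev, weight_fn, half_x=1, half_y=1):
    """Compute d_k from d_{k-1} as a normalized weighted average.

    weight_fn(yc, xc, yr, xr) returns the weight w_r of the reference
    pixel (xr, yr) when filtering the current pixel (xc, yc).
    """
    h, w = d_prev.shape
    d_next = d_prev.astype(float).copy()
    for yc in range(h):
        for xc in range(w):
            acc = 0.0
            norm = 0.0  # assumed normalization: sum of the weights used
            for s in range(-half_y, half_y + 1):
                for p in range(-half_x, half_x + 1):
                    yr, xr = yc + s, xc + p
                    if 0 <= yr < h and 0 <= xr < w:
                        wr = weight_fn(yc, xc, yr, xr)
                        acc += wr * d_prev[yr, xr]
                        norm += wr
            if norm > 0:
                d_next[yc, xc] = acc / norm  # otherwise keep the old value
    return d_next
```

Calling the function repeatedly, each time on its own output, gives the successive iterations; with uniform weights it reduces to a box filter.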

9. The method according to claim 8, characterized in that the weight of the depth map filter is calculated as follows:

[formula not reproduced in the source text],
where C() denotes the function used to compare pixels; e is Euler's number; σ_r is the parameter controlling the weight of the reference pixel in the reference image; σ_t is the parameter controlling the weight of the target pixel in the matching image; (x_r, y_r) are the coordinates of the reference pixel; and (x_t, y_t) are the coordinates of the target pixel.

10. The method according to claim 9, characterized in that the function used to compare pixels is defined as

[formula not reproduced in the source text],
where I_T(x_c, y_c) denotes the intensity of the current pixel in the corresponding color channel and I_T(x_r, y_r) denotes the intensity of the reference pixel in the corresponding color channel.

11. The method according to claim 9, characterized in that the filter weight w_r(x_r, y_r) is set to zero if the corresponding depth map pixel is anomalous, according to the following relation:
if ((d(x_r, y_r) < B) OR (d(x_r, y_r) > T))
w_r(x_r, y_r) = 0, where d(x_r, y_r) is a pixel of the reference depth map, w_r(x_r, y_r) is the weight of the depth map filter, and B and T are the threshold values obtained from histogram processing.
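
The relation above translates directly into a masking step. The sketch below applies it to whole arrays at once; the function name `zero_outlier_weights` is hypothetical, and the weights are assumed to be stored in an array of the same shape as the reference depth map.

```python
import numpy as np

def zero_outlier_weights(w_r, d_ref, B, T):
    """Set the filter weight to zero wherever the depth lies outside [B, T]."""
    outlier = (d_ref < B) | (d_ref > T)
    return np.where(outlier, 0.0, w_r)
```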

12. The method according to claim 9, characterized in that different settings are applied to the filter parameters σ_r and σ_t for regions with pronounced texture and for weakly textured regions, in accordance with the binary segmentation of the image.

13. The method according to claim 1, characterized in that the post-processing of the depth map is performed using a median filter.
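
A median filter for this step can be sketched with NumPy alone. The 3×3 window and the edge-replicating border are assumptions, since the claim does not state a kernel size, and the function name `median_filter_3x3` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_3x3(depth):
    """Replace each pixel with the median of its 3x3 neighborhood."""
    padded = np.pad(depth, 1, mode="edge")          # assumed border handling
    windows = sliding_window_view(padded, (3, 3))   # shape (H, W, 3, 3)
    return np.median(windows, axis=(-2, -1))
```

An isolated spike in the depth map is removed by this filter, while constant regions pass through unchanged.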

14. The method according to claim 1, characterized in that the temporal filtering is performed using a moving average filter.
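
A moving average over consecutive depth maps can be sketched with a small frame buffer. The window length of three frames and the class name `TemporalDepthFilter` are assumptions; the claim does not specify them.

```python
from collections import deque

import numpy as np

class TemporalDepthFilter:
    """Average the most recent depth maps to suppress frame-to-frame flicker."""

    def __init__(self, window=3):
        if window < 1:
            raise ValueError("window must be at least 1")
        self.frames = deque(maxlen=window)  # oldest frame is dropped automatically

    def push(self, depth):
        """Add a new depth map and return the average over the buffer."""
        self.frames.append(np.asarray(depth, dtype=float))
        return np.mean(np.stack(list(self.frames)), axis=0)
```

Until the buffer is full, the average is taken over the frames received so far.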

15. A system for converting stereo content in order to reduce eye fatigue when watching three-dimensional video, comprising:
- a depth map calculation and smoothing unit, configured to analyze and clip the histogram of the reference depth map; check the consistency of the depth map; generate a binary mask of the reference color image separating regions with pronounced texture from weakly textured regions; smooth the reference and matching depth maps by successive iterations of depth map filtering; filter the reference depth map in accordance with the binary mask of the reference image in regions with pronounced texture and in weakly textured regions; post-process the reference and matching depth maps; and temporally filter the reference and matching depth maps;
- a depth control unit, configured to change the depth perception parameters in accordance with the eye-fatigue estimate, wherein the depth perception parameter is represented by a parameter D that varies from 0 to 1 and corresponds to the position of the right-eye view: the value 1 corresponds to the configuration of the original stereo image, the value 0 describes the monocular case, in which the left-eye and right-eye images coincide in space, and the appropriate settings of this parameter lie in the range from 0.1 to 1;
- a visualization unit configured to display the resulting image;
wherein the first output of the depth map calculation and smoothing unit is connected to the first input of the visualization unit, and the second output of the depth map calculation and smoothing unit is connected to the input of the depth control unit, the output of which is connected to the second input of the visualization unit.
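
The connections between the three units can be sketched as a plain data flow. The claim does not say what each output carries, so the sketch treats the two outputs of the depth map unit as opaque values and passes the units in as callables; all names here are hypothetical.

```python
def convert_stereo(stereo_pair, depth_unit, depth_control_unit, visualization_unit):
    """Wire the three units together as described in claim 15."""
    # The depth map calculation and smoothing unit has two outputs.
    first_out, second_out = depth_unit(stereo_pair)
    # The second output feeds the depth control unit.
    control_out = depth_control_unit(second_out)
    # The visualization unit receives the first output and the control output.
    return visualization_unit(first_out, control_out)
```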

16. The system according to claim 15, wherein the depth map calculation and smoothing unit includes a preprocessing unit, an initial depth map calculation unit, a depth map smoothing unit, and a temporal filter unit, wherein the first output of the preprocessing unit is connected to the input of the initial depth map calculation unit, the output of which is connected to the first input of the depth map smoothing unit, and the second output of the preprocessing unit is connected to the second input of the depth map smoothing unit, the output of which is connected to the input of the temporal filter unit.

17. The system according to claim 16, wherein the preprocessing unit includes a stereo image preprocessing unit and a reference image segmentation unit, wherein the input of the preprocessing unit coincides with the input of the stereo image preprocessing unit, the output of the stereo image preprocessing unit is connected to the input of the reference image segmentation unit, and the output of the reference image segmentation unit coincides with the output of the preprocessing unit.

18. The system according to claim 16, wherein the initial depth map calculation unit includes a reference depth map calculation unit, a matching depth map calculation unit, a reference depth map histogram analysis unit, and a depth map consistency check unit, wherein the first input of the initial depth map calculation unit coincides with the input of the reference depth map calculation unit, the output of the reference depth map calculation unit is connected to the reference depth map histogram analysis unit, the second input of the initial depth map calculation unit coincides with the input of the matching depth map calculation unit, the output of the matching depth map calculation unit is connected to the first input of the depth map consistency check unit, the output of the reference depth map histogram analysis unit is connected to the second input of the depth map consistency check unit, and the output of the depth map consistency check unit coincides with the output of the initial depth map calculation unit.

19. The system according to claim 16, wherein the depth map smoothing unit includes an iteration control unit, a depth filtering unit, and a post-processing unit, wherein the input of the depth map smoothing unit coincides with the input of the iteration control unit, the output of the iteration control unit is connected to the input of the depth filtering unit, the output of which is connected to the input of the post-processing unit, the output of which coincides with the output of the depth map smoothing unit.

20. The system according to claim 16, wherein the temporal filter unit includes a frame buffer and a temporal depth filter unit, wherein the input of the frame buffer coincides with the input of the temporal filter unit, and the output of the frame buffer is connected to the input of the temporal depth filter unit, the output of which coincides with the output of the temporal filter unit.