
WO2017083509A1 - Système stéréoscopique (Stereoscopic system) - Google Patents

Système stéréoscopique (Stereoscopic system)

Info

Publication number
WO2017083509A1
Authority
WO
WIPO (PCT)
Prior art keywords
dimensional images
depth map
display
processing
series
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2016/061313
Other languages
English (en)
Inventor
Craig Peterson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/293,398 external-priority patent/US10148932B2/en
Priority claimed from US15/293,625 external-priority patent/US20170140571A1/en
Priority claimed from US15/293,382 external-priority patent/US10277877B2/en
Priority claimed from US15/293,423 external-priority patent/US10242448B2/en
Priority claimed from US15/293,388 external-priority patent/US20170142395A1/en
Priority claimed from US15/293,514 external-priority patent/US10284837B2/en
Priority claimed from US15/293,458 external-priority patent/US10148933B2/en
Priority claimed from US15/293,527 external-priority patent/US10277880B2/en
Priority claimed from US15/293,499 external-priority patent/US10121280B2/en
Priority claimed from US15/293,433 external-priority patent/US10277879B2/en
Priority claimed from US15/293,445 external-priority patent/US10225542B2/en
Priority claimed from US15/293,565 external-priority patent/US10122987B2/en
Application filed by Individual filed Critical Individual
Publication of WO2017083509A1 publication Critical patent/WO2017083509A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/20Image signal generators
    • H04N13/261Image signal generators with monoscopic-to-stereoscopic image conversion
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/128Adjusting depth or disparity

Definitions

  • Two dimensional video content, such as obtained with a video camera having a single aperture, is often either projected onto a display screen for viewing or viewed on a display designed for presenting two dimensional content.
  • The resolution of displays has tended to increase, from standard television interlaced content resolution (e.g., 480i), to high definition television content (e.g., 1080i), to 4K definition television content (4K UHD), and even to higher definition television content (e.g., 8K UHD).
  • Two dimensional content may be converted to three dimensional (3D) image content, including glasses-free and glasses-based three dimensional content, which is thereafter displayed on a suitable display for viewing three dimensional image content.
  • the perception of three dimensional content may involve a third dimension of depth, which may be perceived in a form of binocular disparity by the human visual system. Since the left and the right eyes of the viewer are at different positions, each eye perceives a slightly different view of a field of view. The human brain may then reconstruct the depth information from these different views to perceive a three dimensional view.
  • a three dimensional display may display two or more slightly different images of each scene in a manner that presents each of the views to a different eye of the viewer.
  • a variety of different display technologies may be used, such as for example, anaglyph three dimensional system, passive-polarized three dimensional display system, active-shutter three dimensional display system, autostereoscopic lenticular glasses-free 3D display system, autostereoscopic parallax-barrier glasses-free 3D display system, and head mounted stereoscopic display system.
  • As three dimensional display systems become more prevalent, the desire for suitable three dimensional content to present on such displays increases.
  • One way to generate three dimensional content is using three dimensional computer generated graphics. While such content is suitable for being displayed, the amount of desirable three dimensional computer generated content is limited and typically used for animated content.
  • Another way to generate three dimensional content is using three dimensional video camera systems. Likewise, while such video camera content is suitable for being displayed, the amount of desirable three dimensional content is likewise limited.
  • A preferable technique to generate three dimensional content is to use the vast amounts of available two dimensional content and convert the two dimensional content into three dimensional content. While such two dimensional (2D) to three dimensional (3D) conversion is desirable, the techniques are conventionally complicated and labor intensive.
  • FIG. 1 illustrates an exemplary two dimension to three dimension image conversion process.
  • FIG. 2 illustrates an exemplary 2D to 3D image conversion system.
  • FIG. 3 illustrates an exemplary neural network.
  • FIG. 4 illustrates inputs to the neural network.
  • FIG. 5 illustrates a selection of image based inputs to the neural network and the outputs thereof.
  • FIG. 6 illustrates a selection of bit depths associated with a three dimensional image.
  • FIG. 7 illustrates selection of pixels of an image shifted different distances to provide right eye versus left eye displacements derived from estimated depth to create the perception of apparent three dimensional image depths.
  • FIG. 8 illustrates a screen plane and a depth space "D".
  • FIG. 9 illustrates a screen plane, a shift "Z", a total shifted depth, and a resulting shifted depth.
  • FIG. 10 illustrates a corresponding left eye displacement view and a right eye displacement view at a first depth plane shifted to a second bit depth in front of the screen plane.
  • FIG. 11 illustrates a left eye displacement and a right eye displacement at a first depth plane shifted to a second bit depth in front of the screen plane using a non-linear mapping.
  • FIG. 12 illustrates a left eye and a right eye at a first depth plane shifted to a second pixel depth in front of the screen plane using a plurality of non-linear mappings.
  • FIG. 13 illustrates a depth engine, a depth reprofiling modification mapping, and a rendering engine.
  • FIG. 14 illustrates a video stream processing technique.
  • FIG. 15 illustrates a left-eye image queue and a right-eye image queue.
  • FIG. 16 illustrates a left image queue and a right image queue receiving a sequence of displaced pixel values.
  • FIG. 17 illustrates a display with pixels and/or sub-pixels and an optical lens element for supporting lenticular glasses-free 3D autostereoscopic multi-view.
  • FIG. 18 illustrates a lenticular type imaging arrangement.
  • FIG. 19 illustrates lenticular type sub-pixels under a slanted lenticular lens imaging arrangement.
  • FIG. 20 illustrates an alternate model for computing pixel displacement with examples of a pixel depth behind the screen and a pixel depth in front of the screen.
  • FIG. 21 illustrates a display with a representation of the spacing for the viewer's eyes.
  • FIG. 22 illustrates a display with a representation of the spacing for the viewer's eyes at a further distance from the display than illustrated in FIG. 21.
  • FIG. 23 illustrates a rendering of a three dimensional image with non-uniform shifting.
  • FIGS. 24A-B illustrate the angular differences of presenting images as a result of the viewer shifting.
  • FIG. 25 illustrates a display with a curved front surface.
  • FIG. 26 illustrates rendering a two dimensional image for advertising.
  • FIG. 27 illustrates a system that includes a lens model.
  • FIG. 28 illustrates a modification of a depth map.
  • FIG. 29 illustrates a plug and play approach for inserting a video processing system with a 2D to 3D conversion system in a typical architecture.
  • One technique to achieve two dimensional (2D) to three dimensional (3D) conversion is using a modified time difference technique.
  • the modified time difference technique converts 2D images to 3D images by selecting images that would be a stereo-pair according to the detected motions of objects in the input sequential images. This technique may, if desired, be based upon motion vector information available in the video or otherwise determined.
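As an illustration of the modified time difference idea, the Python sketch below pairs frames whose dominant horizontal motion approximates a stereo baseline. The matching heuristic, thresholds, and function names are assumptions for illustration, not details taken from this publication.

```python
import numpy as np

def estimate_horizontal_motion(frame_a, frame_b, max_shift=16):
    """Estimate the dominant horizontal motion (in pixels) between two
    grayscale float frames by testing integer shifts and keeping the
    one with the smallest mean absolute difference."""
    best_shift, best_err = 0, np.inf
    for s in range(-max_shift, max_shift + 1):
        if s == 0:
            err = np.abs(frame_a - frame_b).mean()
        elif s > 0:
            err = np.abs(frame_a[:, s:] - frame_b[:, :-s]).mean()
        else:
            err = np.abs(frame_a[:, :s] - frame_b[:, -s:]).mean()
        if err < best_err:
            best_shift, best_err = s, err
    return best_shift

def select_stereo_pairs(frames, target_disparity=8, tolerance=4, lookahead=10):
    """For each frame, search the following frames for one whose dominant
    horizontal motion is close to the desired stereo baseline; such a
    pair can then serve as left/right views."""
    pairs = []
    for i in range(len(frames) - 1):
        for j in range(i + 1, min(i + lookahead, len(frames))):
            shift = estimate_horizontal_motion(frames[i], frames[j])
            if abs(abs(shift) - target_disparity) <= tolerance:
                pairs.append((i, j, shift))
                break
    return pairs
```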
  • Another technique to achieve two dimensional (2D) to three dimensional (3D) conversion is using a computed image depth technique.
  • the 3D images are generated based upon the characteristics of each 2D image.
  • the characteristics of the image include, but are not limited to, for example, the contrast of different regions of the image, the sharpness of different regions of the image, the chrominance of different regions of the image, and the texture of different regions of the image.
  • the hue, the saturation, the brightness, and the texture may be used.
  • the sharpness, contrast, and chrominance values of each area of the input image may be determined.
  • the sharpness relates to the high frequency content of the luminance signal of the input image.
  • the contrast relates to a medium frequency content of the luminance signal of the input image.
  • the chrominance relates to the hue and the tone content of the color signal of the input image. Adjacent areas that have close color may be grouped together according to their color values.
  • the image depth may be computed using these characteristics and/or other characteristics, as desired. For example, generally near positioned objects have higher sharpness and higher contrast than far positioned objects and the background image. Thus, the sharpness and contrast may be inversely proportional to the distance. These values may likewise be weighted based upon their spatial location within the image. Other techniques may likewise be used to achieve a 2D to 3D conversion of an input image, including motion compensation, if desired. Referring to FIG. 1 , with a suitable depth map from the 2D to 3D conversion process, a 3D image generation process may be used to generate the 3D images based upon the image depth map.
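A minimal sketch of such a computed image depth technique follows, combining local sharpness (high-frequency luminance content) and local contrast as cues treated as inversely proportional to distance. The block size and equal cue weighting are assumptions.

```python
import numpy as np
from scipy.ndimage import laplace, uniform_filter

def depth_from_cues(luma, block=16):
    """Derive a coarse 8-bit depth map from a grayscale image: regions
    with higher sharpness and contrast are assumed to be nearer."""
    luma = luma.astype(np.float64)
    # Sharpness cue: local average of high-frequency (Laplacian) energy.
    sharpness = uniform_filter(np.abs(laplace(luma)), block)
    # Contrast cue: local variance of the luminance signal.
    contrast = np.maximum(
        uniform_filter(luma ** 2, block) - uniform_filter(luma, block) ** 2, 0.0)

    def normalize(x):
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    cues = 0.5 * normalize(sharpness) + 0.5 * normalize(contrast)
    # High cue values map to near (large) depth values.
    return (cues * 255).astype(np.uint8)
```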
  • the video content may be stored on a storage system.
  • the computing system 110 may use a display 100 as a user interface 160 for selecting three dimensional control parameters for the video content.
  • the control parameters may be used to modify the 2D to 3D conversion process.
  • the computing system may provide the 2D video content and/or control parameters for the 2D to 3D conversion accelerator, as described in detail later.
  • the 2D-3D conversion accelerator then processes the 2D video content, based at least in part on the control parameters provided (if any), to generate 3D video content.
  • the 2D video is provided together with the control parameters from the computing system 110 to the conversion accelerator.
  • the video content may be provided as a single video stream where the left and right images are contained in a single video stream, and/or (2) the video content may be provided as two separate video streams with a full video stream for the left eye's content and a full video stream for the right eye's content.
  • the 3D video content as a result of the conversion accelerator, is rendered on the three dimensional display 140 so that the user may observe the effects of the control parameters in combination with the 2D to 3D conversion accelerator.
  • the user may modify the control parameters, such as by modifying selections on the user interface, for the video content until suitable 3D images are rendered on the three dimensional display 140.
  • the resulting three dimensional content from the 2D-3D conversion accelerator may be provided to the computing system 110, which may be stored in a three dimensional video format (e.g., 3D side-by-side, 3D frame-pack, frame-sequential 3D) for subsequent rendering on a three dimensional display.
  • the 2D-3D conversion accelerator is preferably an external converter to the computing system 110.
  • While a user assisted conversion from 2D image content to 3D image content is feasible, it tends to be rather cumbersome to convert a substantial amount of such video content. Accordingly, it is desirable in a 3D entertainment device to include a fully automated 2D image content to 3D image content conversion system that provides a high quality output.
  • Many conversion systems are based upon analyzing and combining visual cues to create a depth map of the 2D image.
  • the depth map contains a depth value for each pixel in the image or video frame.
  • a different paradigm preferably includes a neural network, which is an information processing paradigm that is inspired by the way biological nervous systems process information.
  • the neural network brain can be trained to create high quality image depth maps that are more extreme and that approximate or mimic what a human could do.
  • the training can result in conversions that are much more complex and sophisticated than a human team might be able to invent manually. The longer the network is trained, the better the results.
  • the neural-net brain with its weighted synapses of each modeled neuron and other learned parameters can be copied on to a hardware board or microchip and put into consumer or other market devices. These devices might just copy the neural-net, or they might also include on-board training processes such as genetic or back-propagation learning technology to continually improve themselves.
  • the 2D to 3D conversion of images using the neural network results in a depth estimation for each pixel in an image, along with the 2D source image, which are then processed using a 3D image render process.
  • any 3D display technology may be used, such as for example, stereo 3D display and multi-view auto stereoscopic display, or even holographic display.
  • the system may process all of the input frames in order or a sub-set thereof.
  • the rendered images may be suitable for glasses-based 3D or glasses-free 3D viewing technologies.
  • the display may also be a projected display, if desired.
  • the neural network includes a number of interconnected computational elements working cooperatively to solve a problem.
  • the neural network may be generally presented as a system of interconnected neurons which can compute values from inputs, and may be capable of learning using an adaptive technique, if desired.
  • the neural network may include the following characteristics. First, it may include sets of adaptive weights, e.g., numerical parameters that are tuned by a learning process. Second, the sets of adaptive weights may be capable of approximating a wide range of functions of their inputs.
  • the adaptive weights and threshold activation functions may be conceptually considered the connection strengths and function computations on synapses between neurons. Traditionally, activation functions have been implemented with some sort of analog circuit due to their complexity.
  • synapse specific transfer function models may be implemented using a combined math-function and table-driven function.
  • synapse transfer function shapes can also be modified by neural training. Being able to modify the transfer function increases the sophistication of computation that can be performed at a synapse and thereby improves the intelligence of the neural net with fewer neurons.
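One plausible (assumed) realization of such a combined math-function and table-driven transfer function is a lookup table seeded with a standard activation shape, whose entries training may then reshape:

```python
import numpy as np

class TableTransferFunction:
    """Synapse transfer function stored as a lookup table so that its
    shape can itself be modified by training. The input range, table
    size, and tanh seed are illustrative assumptions."""

    def __init__(self, x_min=-4.0, x_max=4.0, entries=256):
        self.xs = np.linspace(x_min, x_max, entries)
        self.table = np.tanh(self.xs)  # seed with a standard sigmoid-like shape

    def __call__(self, x):
        # Linearly interpolate between table entries; inputs outside the
        # modeled range saturate at the table end points.
        return np.interp(x, self.xs, self.table)

    def nudge(self, index, delta):
        # A trainer can reshape the curve by adjusting individual entries.
        self.table[index] += delta
```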
  • the neural network, thresholds, and transfer functions perform many functions collectively and in parallel by units.
  • the neural network may optionally include back propagation, feed forward, recurrent, and genetic learning structures. The neural network technique can achieve a natural appearance for 3D structures similar to what a human might do manually because it can learn by comparing its results with human optimized examples.
  • the first layer is the inputs to the neural network which may be the output from various pre-analyzers including color space conversion, resolution decimation, texture, edges, facial and object detection, etc.
  • the pixel values may be converted to a different format, if desired.
  • Each of the neuron synapses may have various weights, thresholds, and transfer functions associated therewith.
  • Each activation function may be updated and may be unique for each node or synapse.
  • the preferable inputs to the neural network include information that may characterize the image.
  • One of the inputs for an image, or regions thereof, is the values of the pixels and their color components. In many cases, the color components are red, blue, and green, together with their associated magnitudes.
  • Other techniques may be used to characterize an image, such as for example, red-blue-green-yellow, hue-saturation-brightness, or YCrCb.
  • While the hue, saturation, and/or brightness provide information regarding the color characteristics of the image, it is also desirable to include information related to the nature of the texture of the image. In general, texture characteristics quantify the perceived texture of an image. As such, texture characteristics provide information about the spatial arrangement of color and/or intensities in an image or a selected region of the image.
  • Texture provides indications that an object in an image or frame might be closer.
  • a texture may have its own 3D depth texture.
  • edges may be determined at points, lines, or arcs of an image at which the image brightness changes sufficiently sharply.
  • the edge aspects of the image tend to indicate discontinuities in the depth of the image, discontinuities in the surface orientation, changes in material properties, and/or variations in scene illumination.
  • Such structure information may be obtained in a suitable manner, such as through segmentation based techniques.
  • the structural information may be generally related to the identification of items within the image. This structural information may be provided as an input to the neural network to further determine a more accurate depth map.
  • facial features of the image tend to be those regions of the image that are of particular importance to the viewer.
  • the estimation of the depth and/or the rendering may also be based upon updating of the fields and/or system feedback.
  • One technique for training a neural network is to collect a selection of images and associated instrument measured three dimensional depth maps.
  • the output of the processing by the neural network may be graded for accuracy, and the neural network updated accordingly to cause learning.
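A schematic sketch of such grading-driven learning is shown below; `net.forward` and `net.backward` are placeholders for whatever depth-estimation model is used, not a specific library API.

```python
import numpy as np

def train_step(net, image_features, measured_depth, lr=1e-3):
    """One supervised update: grade the network's estimated depth map
    against an instrument-measured depth map, then push the weights in
    the direction that reduces the error."""
    predicted = net.forward(image_features)        # per-pixel depth estimate
    error = predicted - measured_depth             # signed per-pixel grading
    loss = float(np.mean(error ** 2))              # mean squared error score
    net.backward(2.0 * error / error.size, lr=lr)  # back-propagate the gradient
    return loss
```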
  • the depth behind the image plane may be broken up into 256 depth levels (e.g., 8 bits).
  • the 8-bit depth map may be referenced from the 255 level being at the plane of the screen.
  • 3D depth may be represented by a range of 0 to 255 or eight bits of resolution.
  • the amount of perceived depth is determined by the amount of horizontal displacement of left and right eye pixels associated with a depth value.
  • the z axis measures distance behind the screen or in front of the screen.
  • There may be an additional z axis offset control, where the three dimensional box can be offset on the z axis to be partly or even entirely in front of the screen instead of only behind the screen.
  • The movie "Titanic" was converted to 3D by a team of 300 people and took 18 months.
  • the technique described herein may convert 2D "Titanic" to 3D real-time in less than one frame delay (one sixtieth of a second) and have part of the movie significantly popped out into the viewer's space during the entire movie in a natural easy-to-watch way that creates an enjoyable 3D experience.
  • the technique can do that and output to any type of 3D display that is glasses-based 3D, or glasses-free 3D, or even holographic 3D.
  • a pixel in the picture plane with a depth map pixel corresponding to depth-level 128 may be viewed at such a depth by shifting the pixel for the right eye view to the right by an appropriate distance and shifting the left eye view to the left by an appropriate distance from what would have otherwise been a central location in a two dimensional image.
  • A pixel in the picture plane with a depth map pixel corresponding to depth-level 64 may be viewed at such a depth by shifting the pixel for the right eye view to the right by an appropriate distance and shifting the left eye view to the left by an appropriate distance from what would have otherwise been a central location in a two dimensional image. As illustrated in FIG. 7, the central location would be the same for both shifts, namely, for a bit depth of 128 and a bit depth of 64. As may be observed, the greater the horizontal separation between the pixel positions for the left image and the right image, the greater the apparent depth of the pixel in the image.
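The displacement described above can be sketched for a single scan line as follows; here, per the text's convention, depth 255 is taken at the screen plane with smaller values deeper behind it, and hole filling between displaced pixels is omitted for brevity.

```python
import numpy as np

def render_stereo_row(row_pixels, row_depth, max_disparity=16):
    """Shift each pixel of one scan line horizontally by a disparity
    derived from its 8-bit depth value: the right-eye copy moves right
    and the left-eye copy moves left, with larger separations reading
    as greater apparent depth."""
    width = row_pixels.shape[0]
    left = np.zeros_like(row_pixels)
    right = np.zeros_like(row_pixels)
    for x in range(width):
        disparity = (255 - int(row_depth[x])) * max_disparity // 255
        left[max(0, min(width - 1, x - disparity))] = row_pixels[x]
        right[max(0, min(width - 1, x + disparity))] = row_pixels[x]
    return left, right
```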
  • the image may be mapped into a depth space based upon a relative location of the front of the screen, which may be considered a "0" point for convenience having a depth of "D", such as 256 levels for an 8-bit depth. It may be desirable to provide the appearance of a substantial portion of the 3D image appearing in front of the plane of the screen for increased visual desirability.
  • the depth map of the image may be shifted by an amount
  • the image may be scaled to shift the image to increase the overall depth of the image in front of the screen plane, such as using a linear or non-linear function Z.
  • the image may be both scaled and shifted, if desired. However, preferably the resulting shifted depth of a pixel is less than the total shifted depth of the image.
  • the shifting of the pixel may be achieved by adding and/or subtracting a depth value of Z and then remapping the pixels to the modified three dimensional depth.
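A minimal sketch of this shift-and-scale operation on an 8-bit depth map (the parameter names are ours):

```python
import numpy as np

def reposition_depth(depth_map, z_offset=0, scale=1.0):
    """Scale and/or shift the depth map along the z axis so part of the
    scene pops in front of the screen plane, clipping so that no pixel
    exceeds the total shifted depth range."""
    d = depth_map.astype(np.float64) * scale + z_offset
    return np.clip(d, 0, 255).astype(np.uint8)
```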
  • a modified mapping may be based upon a generally concave curve-like function where, as the pixel mapping moves increasingly further in front of the display, the curve tends to displace the pixels a greater distance. This revised mapping may be used for the entire display or a portion thereof.
  • the image may be separated into a plurality of different regions, such as region 1, region 2, and region 3.
  • the regions are preferably defined based upon the objects detected in the image, such as, for example, using a segmentation based technique, a face based technique, a texture based technique, etc.
  • One of the regions may be a facial region of a person.
  • a different mapping may be used that is selected to enhance the visual quality for the viewer.
  • the 2D to 3D conversion of images may result in a pixel depth estimation.
  • the data structure may provide a mapping between the input depth map and the output depth map, which accounts for the non-linear optimization of the depth of the image.
  • the optimized depth map is then provided to the 3D image render process (e.g., rendering engine). More than one data structure may be used, if desired, each with different properties. This provides an efficient technique for the mapping for the depth map adjustment.
  • the depth map re-profiling may be performed in accordance with a look-up-table. Each table entry may re-map an input depth value to a modified output value.
  • the depth map re-profiling may be performed in accordance with a formula. Each input depth value may be modified to a modified output value based upon the formula.
  • the depth map re-profiling may be performed in accordance with a non-linear curve (e.g., a mapping between an input depth N and an output depth Y). Each input depth value may be modified to a modified output value based upon the curve.
  • the depth map re-profiling may be performed in accordance with a linear level (e.g., a linear mapping between an input depth N and an output depth Y). Each input depth value may be modified to a modified output value based upon the level.
  • the depth map re-profiling may be performed in accordance with a histogram based technique (e.g., a histogram of values where the lower point may be dragged to stretch the depth towards or away from the back, where the higher point may be dragged to stretch the depth towards or away from the front, and a central point to stretch or compress the depth forward or backward).
  • Each input depth value may be modified to a modified output value based upon the histogram.
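The look-up-table, formula/curve, linear, and histogram variants above can all be reduced to building a 256-entry table once and then indexing it per pixel; the sketch below illustrates two of the variants with assumed parameter values.

```python
import numpy as np

def build_reprofile_lut(mode="curve", gamma=0.7, low=0, mid=128, high=255):
    """Build a 256-entry depth re-profiling table. 'curve' illustrates a
    formula-based non-linear remap; 'histogram' illustrates dragging the
    low, central, and high points to stretch or compress the depth."""
    x = np.arange(256, dtype=np.float64)
    if mode == "curve":
        y = 255.0 * (x / 255.0) ** gamma
    elif mode == "histogram":
        # Piecewise-linear stretch through the dragged low/mid/high points.
        y = np.interp(x, [0.0, 128.0, 255.0], [low, mid, high])
    else:
        y = x  # identity (linear level)
    return np.clip(y, 0, 255).astype(np.uint8)

def reprofile(depth_map, lut):
    # Each input depth value indexes the table to get its modified output.
    return lut[depth_map]
```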
  • Display devices tend to include a substantial number of pixels, such as a 4K display having 4096 × 2160 for an IMAX movie, or 3840 × 2160 for the 4K UHD TV standard.
  • An 8-bit per color channel 4K UHD video frame requires a buffer memory having a size of approximately 32 MB to store an uncompressed frame of data. Using such a large buffer for one or more frames for the neural network tends to be costly and to consume significant amounts of power in memory accesses, which is problematic for a mobile device that has limited battery life.
  • the depth engine outputs a depth map in a line-by-line manner (or portions thereof).
  • the depth engine may include a limited amount of temporal buffering so that small regions of the image may be processed to determine image characteristics, such as texture, edges, facial regions, etc.
  • One technique to modify the bit depth of the depth map may be as follows.
  • the system may use a small direct memory access memory, such as a 256-deep memory, where the original depth value is used as an index (address) into the memory, which outputs a new depth value (the same look-up-table remapping sketched above).
  • the depth engine or the modified depth map may be provided to a FIFO queue of streaming pixels for the left image view and a FIFO of pixels for the right image view that is provided to the rendering engine.
  • the queues may be a combined queue, if desired.
  • the queue is preferably sized to accommodate at least the largest potential displacement, plus and minus, permitted for a corresponding pair of pixels.
  • the source pixel is displaced from the middle of the FIFO based upon the displacement associated with the pixel's depth map value and z offset control and the specific view. Additional "video effects" displacement offsets can be added to the normal displacement offset to create a variety of special effects or video compensations for the image on specific display technologies.
  • the right pixel may be displaced in the right image queue buffer at an approximate position relative to the left pixel queue buffer.
  • the pixel values may be positioned in an appropriate location within the respective left image queue and the right image queue.
  • an embodiment illustrates one technique to use a displacement technique with a pair of buffers.
  • This particular example is a stereo or two-eye view, but may also be modified for a glasses-free 3D model when more views are desired. In that case, there would often be a row for each view.
  • the depth map or modified depth map may have a pixel value A with a displacement D1.
  • the pixel value A is then included in the left image queue at a pixel position that is left of the original, and pixel value A is inserted in the right image queue at a pixel position that is right of the original, by an amount corresponding to D1.
  • This is a stereo 3D example or 2-view autostereo example.
  • the pixel value B is then included in the left image queue and the right image queue offset from the mid-point with a displacement D2.
  • As the pixel values of A and B are shifted to the right, the depth map or modified depth map may have a next pixel value of C with a displacement D3.
  • the pixel value C is then included in the left image queue and the right image queue offset from the mid-point with a displacement D3.
  • As the pixel values of A, B, and C are shifted to the right, the depth map or modified depth map may have a next pixel value of D with a displacement D4.
  • the pixel value D is then included in the left image queue and the right image queue offset from the mid-point with a displacement D4.
  • As the pixel values of A, B, C, and D are shifted to the right, the depth map or modified depth map may have a next pixel value of E with a displacement D5.
  • the pixel value E is then included in the left image queue and the right image queue offset from the mid-point with a displacement D5.
  • the pixel value of A is provided for the left image and the pixel value of D is provided to the right image. This process may be continued, as desired.
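The queue walkthrough above might be expressed in software as follows; the depth-to-displacement mapping, queue length, and sign conventions are illustrative assumptions, and `None` outputs mark holes that a real renderer would fill.

```python
def displace_into_queues(pixels, depths, queue_len=32, z_offset=0):
    """Each source pixel is written into the left and right image queues
    at positions displaced from the queue midpoint by an amount derived
    from its depth value (and a z offset); each cycle both queues shift
    by one position and the emerging entries become the next left-image
    and right-image output pixels."""
    mid = queue_len // 2
    left_q = [None] * queue_len
    right_q = [None] * queue_len
    out_left, out_right = [], []
    for pixel, depth in zip(pixels, depths):
        disp = (255 - int(depth)) * (mid - 1) // 255 + z_offset
        left_q[max(0, min(queue_len - 1, mid - disp))] = pixel   # left view shifts left
        right_q[max(0, min(queue_len - 1, mid + disp))] = pixel  # right view shifts right
        out_left.append(left_q.pop(0))   # queues advance one position per pixel clock
        out_right.append(right_q.pop(0))
        left_q.append(None)
        right_q.append(None)
    return out_left, out_right
```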
  • a typical lenticular autostereoscopic display apparatus 1700 includes a matrix pixel display device comprising an LC (liquid crystal) display panel 1710 having rows and columns of display elements 1720, together with an angular lens array acting as a spatial light refractor to visually isolate specific views relative to each of a viewer's eyes. A backlight 1730 is also illustrated.
  • Lenticular elements are provided, such as by using a lenticular sheet optical lens with prisms 1740, whose lenticules 1750 (exaggerated in size) include elongate semi-cylindrical lens elements extending in the column direction of the display panel, parallel to the display element columns. Each lenticule overlies a respective group of two or more adjacent columns of display elements.
  • the LCD matrix includes regularly spaced rows and columns of display elements.
  • the display elements are arranged as columns of approximately square pixels, where each pixel is composed of a row of red, green, and blue sub-pixels.
  • a group of three or more sub-pixels (e.g., red, green, and blue) forms a pixel of the display.
  • Other structures and arrangements of display elements and optical elements may be used.
  • each lenticule is typically associated with two to four columns of display sub-pixels per pixel row.
  • the display sub-pixels in each column provide a vertical segment of a specific eye-view to be rendered.
  • a single prism on the lenticular lens typically has a magnification of 2x to 4x, which allows primarily one of the sub-pixels to be seen from a specific eye-view angle on a specific pixel row. Because the viewer's second eye is at a different horizontal viewing position, it sees a different view and sub-pixel compared to the first eye. This is what enables delivering a different view experience to each eye. In multi-view screens which have 7, 8, or 9 views, a viewer can move their head side to side and see various views in each eye, which makes it appear as though one can see around 3D objects.
  • Referring to FIG. 18, the operation of a lenticular type of imaging arrangement is illustrated.
  • the light source, display panel, and lenticular sheet are illustrated.
  • the arrangement provides three views of each image projected in a different direction.
  • Eye position 1 could be the viewer's right eye
  • Eye position 2 could be a viewer's left eye.
  • Each sub-pixel on a pixel row of the display is driven with information for one specific view, such as for the left or right eye of the viewer.
  • the lenticules may be arranged in a slanted arrangement with respect to the columns of display pixels, that is, their main longitudinal axis is at an angle to the column directions of the display element array.
  • the sub-pixels are labeled with their corresponding view of the multi-view arrangement.
  • some of the pixels are split among a plurality of different lenticules, so that part of their light is projected by more than one lenticule.
  • the particular view being observed depends on the location of the viewer, which may be represented as a particular point location.
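A common way to realize such a slanted-lenticular view assignment is to compute a view index per sub-pixel and interleave the view images accordingly; the slant of 1/3 sub-pixel per row and the view count below are assumptions, not values from this publication.

```python
import numpy as np

def subpixel_view(col, row, channel, num_views=8, slant=1.0 / 3.0):
    """View index for one display sub-pixel under a slanted lenticular
    lens, in the style of commonly used slanted-lenticular layouts."""
    k = 3 * col + channel                    # absolute sub-pixel column (R, G, B)
    return int(k + row * 3 * slant) % num_views

def interleave_views(views):
    """Compose the panel image: each sub-pixel samples its own color
    channel from the view image it is assigned to. `views` is a list
    of view images of identical shape (H, W, 3)."""
    num_views = len(views)
    h, w, _ = views[0].shape
    panel = np.zeros((h, w, 3), dtype=views[0].dtype)
    for y in range(h):
        for x in range(w):
            for c in range(3):
                panel[y, x, c] = views[subpixel_view(x, y, c, num_views)][y, x, c]
    return panel
```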
  • Referring to FIG. 20, another technique of calculating the horizontal pixel displacements is illustrated with three different pixel depths: a pixel at a depth position behind the screen versus a couple of pixel positions in front of the screen.
  • this modified technique offers popout and depth which are proportionally relative to the viewer's distance from the display screen. This improves the 3D experience for a viewer that is further from the display screen.
  • the displacement of the pixels displayed on the screen may be illustrated as S1.
  • the pixel shifts reverse for the eyes and the shift of the pixels for being displayed on the screen may be illustrated as S2.
  • the distance of the shift on the screen varies with the depth behind the display and the depth in front of the display.
  • the shift in the pixel distances should be based upon the distance between the eyes of the viewer.
  • the shift behind S1 tends to vary from the displacement being substantially equal to the distance between the eyes of the viewer (at a distance behind the display nearing infinity) to a displacement of zero with the distance at the display.
  • the displacement in front S2 tends to vary from the displacement being zero with the distance in front of the display being equal to zero to a substantial displacement that increases substantially as the shift gets increasingly closer to the viewing plane. It may be observed, that the shift behind S1 for changes in depth behind the display results in relatively minor shifts compared to the corresponding shifts S2 for changes in the depth in front of the display.
  • the depth map and/or rendering should account for the differences in the rendering with respect to the distance between the eyes of the viewer and the distance that the viewer is from the display screen.
  • This creates yet more realistic 3D geometries and can facilitate a greater 3D pop-out effect in front of the display.
  • the shift in front S3 tends to vary at an ever increasing manner such that even a minor z axis offset in front of the display results in a substantial displacement of the pixels.
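These behaviors follow from similar-triangle parallax geometry: for eye separation e, viewing distance V, and depth d, the on-screen shift is e·d/(V + d) behind the screen (approaching e as d grows without bound) and e·d/(V − d) in front of it (growing rapidly as the point nears the viewing plane). A small sketch, using our own symbols:

```python
def screen_disparity(eye_separation_mm, viewing_distance_mm, depth_mm):
    """On-screen horizontal shift for a point at signed depth
    (positive = behind the screen, negative = in front), from
    similar-triangle parallax geometry."""
    e, v, d = eye_separation_mm, viewing_distance_mm, depth_mm
    if d >= 0:
        return e * d / (v + d)   # behind: approaches e as d -> infinity
    d = -d
    if d >= v:
        raise ValueError("point cannot reach or pass the viewing plane")
    return -e * d / (v - d)      # in front: grows rapidly as d -> v
```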
  • the angle to each pixel or sub-pixel varies with its relative position with respect to the eyes.
  • the angle to each pixel or sub-pixel also varies depending on which eye the image is being sensed by. Accordingly, in this manner, the position of the particular eye relative to the display is different for each pixel or sub-pixel of the desired view of the display. These different angles impact the quality of the rendered image.
  • Referring to FIG. 22, the display is illustrated with a representation of the spacing for the viewer's eyes, as in the auto-stereoscopic display of FIG. 21, but with the eyes of the viewer moved a further distance from the display.
  • the angle to each pixel or sub-pixel varies with its relative position with respect to the eyes and is different than illustrated in FIG. 21.
  • the angle to each pixel or sub-pixel also varies depending on which eye the image is being sensed by and is different than illustrated in FIG. 21. Accordingly, in this manner, the position of the particular eye relative to the display is different for each pixel or sub-pixel of the desired view of the display and is different than illustrated in FIG. 21.
  • In addition to movement of the viewer in a perpendicular direction to the display, the viewer also tends to move in a parallel direction with respect to the display. As the viewer moves in a horizontal direction with respect to the display, the angle to each pixel or sub-pixel further varies with its respective position with respect to the eyes. In addition, the angle to each pixel or sub-pixel also varies depending on which eye the image is being sensed by. Accordingly, in this manner, the position of the particular eye relative to the display is different for each pixel or sub-pixel of the desired view of the display. These different angles impact the quality of the rendered image. A viewer may see several different views with the same eye as they look from one side of the display to the other.
  • a typical display may have six to nine different views per lenticule.
  • the views from left to right may sequence as 1, 2, 3, 4, 5, 6, 7, 8.
  • the left eye may be looking at view 4 and the right eye at view 7.
  • the views repeat over and over: 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, ...
  • this dead zone tends to disappear using smoothing or blending between views. But when the left eye is looking at view 7 and the right eye is looking at view 3, depth becomes inverted, and things that were popped out of the display screen shift to behind the screen. This can cause a warped view of objects that cross this zone. Since the viewer's eye actually sees different views due to different angles between the viewer's eye and the pixels across the screen, a viewer may see this depth inversion effect in a section of the screen. As the viewer moves side-to-side, the warped zone will move back and forth. This means that it is difficult to find a position where this zone is not seen somewhere on the screen.
  • a horizontal view-stretching transformation can be applied to compensate for the angle of the eye relative to the pixels from one side of the screen to the other. This provides very wide zones where no (or fewer) dead zones/warp zones can be seen. And when seen, the whole screen will warp the same way, thereby preserving the integrity of the objects' geometries on the screen.
  • This stretching transformation reduces the effect of dead zones by stretching the views associated with selected pixels for a particular viewer distance from the screen. In this manner, the views may be stretched for one or more adjacent sub-pixels of the image so that the viewer's eye observes the same view all the way across the screen.
  • When rendering the three dimensional image on the display, it is desirable to modify the rendering to account for the angular variations in the rendering of the images on the display from a location in front of the display, such as, for example, in front of the center of the display at a distance of 8 times the height of the display.
  • the modification of the image may be the result of effectively expanding the distance between particular pixels of the image (or sub-pixels) and/or effectively compressing the distance between particular pixels of the image (or sub-pixels). This expansion and/or compression of the distance is preferably done in a manner that is symmetrically centered with respect to the center of the screen.
  • Ax² generally refers to the non-linear shifting with respect to the angle of the viewer with respect to the display.
  • Bx generally refers to a linear shifting amount with respect to the horizontal angle of the viewer with respect to the display.
  • C generally refers to a fixed offset of the entire subpixel array.
  • As the pixels are positioned at locations further from the center of the display (with the viewer centered on the display), they are increasingly shifted a greater distance. This shifting of the location of the pixels of the image that are to be rendered on the display decreases the perception of the dead zones with respect to the viewer.
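Taken together, the displacement can be written as the polynomial A·x² + B·x + C; a one-line sketch with purely illustrative coefficient values:

```python
def angular_compensation_shift(x_norm, a=0.05, b=0.0, c=0.0):
    """Horizontal view shift at normalized screen position x in [-1, 1]
    about the center: A bends the shift non-linearly toward the edges,
    B applies a linear skew, and C offsets the whole sub-pixel array
    (for example, to follow a tracked eye position)."""
    return a * x_norm ** 2 + b * x_norm + c
```

A viewer-tracking driver, as discussed below, could update C at frame rate while leaving A and B fixed.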
  • Displacement transformations can be used to provide many image enhancement features.
  • Two-view parallax barrier displays are commonly used for small mobile displays.
  • a side effect of today's only-two-view autostereo displays is that the sweet spot for seeing 3D can be very narrow. Unless a viewer holds the display just right, they will not see 3D. But, a software driver that uses the mobile device camera could track the position of the viewer's eyes relative to the display, and a constant shift, such as the C parameter in the previous polynomial example, could be added into the displacement calculation. This C would be picked to compensate for the viewer's eye position so that the viewer always sees the 3D sweet spot. This is likewise applicable to non-mobile displays.
  • the display may present a 3D image by presenting the different images to the different eyes of the viewer, as illustrated by the solid lines. If the viewer shifts to a different location, such as shifting to the right, the display may present a modified 3D image by presenting the different views to the different eyes of the viewer, as illustrated by the dashed lines.
  • the image that is presented may be modified as described in relation to FIG. 23.
  • the adjustment may include "C" to shift the image to a more appropriate location to be directly in line with the eyes of the viewer so that the dead zones are reduced.
  • the displacement modifying function parameters may be provided together with the video stream, tailored for a particular video sequence.
  • the displacement modifying function may be implemented with a look-up table.
  • the display panel may include a curved front surface with pixels defined thereon.
  • the physical locations of the pixels are arranged in a curved orientation.
  • the curved displays may be manufactured from liquid crystal material or organic light emitting diodes.
  • the curvature of the display causes different angles between eye and screen at different horizontal locations on the screen than a flat screen.
  • the displacement modifying function can be used to shift/stretch view locations to compensate for the different angles across the screen for any specific distance that the viewer is from the screen.
  • It may be desirable to use a three dimensional display as a presentation device for advertisements or otherwise. While obtaining such content using a three dimensional imaging device, or otherwise computer generating three dimensional content, is possible, those options tend to be relatively expensive.
  • two dimensional images may be of a product, such as a large energy drink, from an angle of 45 degrees from a position above the product.
  • the system may use a suitable technique to convert the 2D image(s) to 3D images.
  • the angular relationship with respect to the object may be used to transform the orientation of the object rendered in the 3D image generation process for a display that is oriented facing up, such as a display that is mounted in a table top.
  • the displacement modifying function could be set to transform objects up and out of the display, which would make them look holographic, sitting on top of the display table.
  • sporting events shot from a known angle could be transformed such that the players look like they are running plays across the top of the display table. For example, based upon the angular relationship used to obtain the two dimensional image content, the object may be transformed accordingly.
  • each of the views for the image may be transformed in a different manner. For example, for the large energy drink the transformation for the right eye view may stretch the image upwardly to the right while the transformation for the left eye view may stretch the image upwardly to the left.
  • This table top transformation often requires only a trapezoid calculation relative to the vertical pixel row position.
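A sketch of that trapezoid calculation, resampling each row about its center with a scale interpolated between the top and bottom of the frame (the scale values are illustrative):

```python
import numpy as np

def trapezoid_row_stretch(image, top_scale=1.0, bottom_scale=0.6):
    """Trapezoid transform relative to the vertical pixel row position:
    each row is resampled symmetrically about its horizontal center,
    with the scale varying linearly from top to bottom. Unfilled edge
    positions are left black in this sketch."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    for y in range(h):
        scale = top_scale + (bottom_scale - top_scale) * y / (h - 1)
        # Source column for each destination column at this row's scale.
        xs = ((np.arange(w) - w / 2) / scale + w / 2).astype(int)
        valid = (xs >= 0) & (xs < w)
        out[y, valid] = image[y, xs[valid]]
    return out
```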
  • the two dimensional images may have been captured using a variety of different lenses.
  • a large telephoto lens is used which tends to result in a relatively narrow field of view and, in some cases, result in a relatively shallow depth of field.
  • a wide angle lens is used which tends to result in a relatively wide field of view and, in some cases, a relatively deep depth of field.
  • Each of these lenses tends to result in a different two dimensional image that preferably has different three dimensional characteristics when rendered on a three dimensional display.
  • the displacement modifying function can be used to create this new 3D effect of modeling a particular type of lens.
  • the system may include a lens model, such as, modeling the characteristics of a wide angle lens, a standard lens, a telephoto lens, a fish eye lens, etc.
  • a telephoto lens may be characterized, in part, by a straight and linear rendering of the three dimensional image/frame.
  • a wide angle lens may be characterized, in part, by increasing shifting of pixels away from the center of the display screen the closer they are to the viewer. This is a function of the Z axis position of the pixels in 3D space and the horizontal and possibly the vertical distance of the pixels from the center of the screen.
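A toy sketch of such a wide-angle lens model, with an assumed strength constant:

```python
def wide_angle_shift(x_center_dist, z_depth, strength=0.02):
    """Shift a pixel away from the screen center by an amount that grows
    with both its distance from the center and its nearness to the
    viewer along the z axis; a telephoto model would keep this shift
    near zero (straight, linear rendering)."""
    nearness = max(0.0, z_depth)  # z_depth > 0 taken as in front of the screen
    return strength * x_center_dist * nearness
```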
  • The terms image and/or frame are used interchangeably, and two or more images and/or frames refer to a video or sequence.
  • the primary concentration of depths of the depth map may be within a range of 30 to 120 out of a range of 0 to 255. However, for many images it is more desirable to have a greater range of depths for the primary concentration so that the images have more depth.
  • a mapping may be used to expand the depth map so that those pixels clustered in the concentrated region are spread out while those pixels that are not clustered in the concentrated region are not expanded to the same extent.
  • This modification of the depth mapping may be based upon a table-mapped or formula mapped remapping of the input depth values to a wider range of values.
  • This modification of the depth mapping may also be based upon a non-linear adjustment of a different amount for different portions of the range.
  • this modification may be based upon determining a central point (or otherwise) of the depths of the pixels for the particular image(s), and adjusting the range based upon the central point and the clustering of the depths of the pixels around the central point. In this manner, the adjustment is adaptive to the particular content of the image(s). In the event that the depth map is not sufficiently compressed, the modification of the depth map may result in compressing the depth map (or a portion of the depth map).
  • the modification applied to the depth map for the image(s) may compress portions of the depth map while expanding other portions of the depth map.
  • the modifications to the depth map may be applied in a different manner to different portions of the depth map for the images. For example, a first region of the image may be applied a first modification and a second region of the image may be applied a second modification.
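A minimal sketch of such an adaptive expansion follows, using a uniform stretch about the central point of the depth distribution; the median choice and stretch factor are assumptions, and per the preceding paragraphs the same operation could be applied with different parameters to different regions or portions of the range.

```python
import numpy as np

def expand_depth_range(depth_map, stretch=2.0):
    """Find the central point of the depth distribution and spread the
    clustered values away from it, so a depth map concentrated in, say,
    30-120 uses more of the available 0-255 range."""
    d = depth_map.astype(np.float64)
    center = np.median(d)              # central point of the pixel depths
    expanded = center + (d - center) * stretch
    return np.clip(expanded, 0, 255).astype(np.uint8)
```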
  • a video system may include a suitable type of electronics and an input for the two dimensional video content.
  • the video system includes a system-on-a-chip (SOC) with microprocessor, memory, a video processor, video drivers, etc. all in the same microchip. It is not practical to insert additional hardware functions into the middle of the SOC.
  • the output of the video system is a video signal that would otherwise be connected directly to the display panel.
  • this interface between the output of the motherboard and the display is conventionally limited to standard interfaces, such as for example, MIPI, low-voltage differential signaling (LVDS), V-by-One.
  • the 2D to 3D conversion system may then use the techniques described herein to generate 3D video content from the 2D video content. After generating 3D video content, the output of the 2D to 3D conversion system provides a 3D video signal for the display. In this manner, it is relatively plug-and-play to incorporate the 2D to 3D conversion system into existing display technologies and/or computer systems.
  • the video system, display, and 2D to 3D conversion system are included within a 3D display product.
  • the output of the video motherboard is connected to the display with a cable.
  • the cable may include an integrated 2D to 3D conversion system microchip.
  • an existing cable supporting 2D video content to the display may be replaced by a modified cable supporting the conversion of the 2D video to 3D video, while using the same or similar connectors on the display.
  • the driver of the video system may place control codes into the video bitstream suitable to assist and communicate modifications of the conversion to the 2D to 3D conversion system. Such control codes are preferably encoded into the bitstream in a manner such that they are otherwise not visible on the display.
  • the control codes are received by the 2D to 3D conversion system, and used to modify the conversion parameters. If the software system running on the motherboard detects remote control button functions to change depth, Z position, or other effects, it can pass the request by calling associated functions in a software driver associated with the conversion system. The software driver can then insert control codes in the video stream which can then be decoded by the conversion system and executed properly. This enables the conversion system to not require any special control wires/leads that don't normally exist in a normal display system.
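Purely as an illustration of the idea, the sketch below hides a control code in the first pixels of the top video line; the marker bytes and placement are invented for this example, and a real design would encode the codes so that they are not visible on the display.

```python
import numpy as np

MAGIC = (0x3D, 0xC0, 0xDE)  # hypothetical marker bytes, not from this publication

def embed_control_code(frame, code_bytes):
    """Write MAGIC, a length byte, and the code bytes into the red
    channel of the first pixels of the top line of an HxWx3 uint8
    frame, so a downstream 2D-to-3D converter can read them."""
    payload = list(MAGIC) + [len(code_bytes)] + list(code_bytes)
    frame[0, :len(payload), 0] = payload
    return frame

def extract_control_code(frame):
    """Return the embedded code bytes, or None if no marker is present."""
    row = frame[0, :, 0]
    if tuple(row[:3]) != MAGIC:
        return None
    n = int(row[3])
    return bytes(row[4:4 + n].tolist())
```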

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Controls And Circuits For Display Device (AREA)
  • Processing Or Creating Images (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)

Abstract

The invention relates to a stereoscopic system.
PCT/US2016/061313 2015-11-13 2016-11-10 Système stéréoscopique Ceased WO2017083509A1 (fr)

Applications Claiming Priority (48)

Application Number Priority Date Filing Date Title
US201562255153P 2015-11-13 2015-11-13
US201562255081P 2015-11-13 2015-11-13
US201562255119P 2015-11-13 2015-11-13
US201562255192P 2015-11-13 2015-11-13
US201562255103P 2015-11-13 2015-11-13
US201562255199P 2015-11-13 2015-11-13
US201562255173P 2015-11-13 2015-11-13
US201562255179P 2015-11-13 2015-11-13
US201562255132P 2015-11-13 2015-11-13
US201562255166P 2015-11-13 2015-11-13
US201562255208P 2015-11-13 2015-11-13
US201562255092P 2015-11-13 2015-11-13
US62/255,103 2015-11-13
US62/255,153 2015-11-13
US62/255,192 2015-11-13
US62/255,179 2015-11-13
US62/255,166 2015-11-13
US62/255,132 2015-11-13
US62/255,119 2015-11-13
US62/255,208 2015-11-13
US62/255,199 2015-11-13
US62/255,092 2015-11-13
US62/255,173 2015-11-13
US62/255,081 2015-11-13
US15/293,398 2016-10-14
US15/293,382 US10277877B2 (en) 2015-11-13 2016-10-14 3D system including a neural network
US15/293,423 US10242448B2 (en) 2015-11-13 2016-10-14 3D system including queue management
US15/293,388 US20170142395A1 (en) 2015-11-13 2016-10-14 3d system including pop out adjustment
US15/293,625 2016-10-14
US15/293,514 US10284837B2 (en) 2015-11-13 2016-10-14 3D system including lens modeling
US15/293,527 US10277880B2 (en) 2015-11-13 2016-10-14 3D system including rendering with variable displacement
US15/293,382 2016-10-14
US15/293,433 US10277879B2 (en) 2015-11-13 2016-10-14 3D system including rendering with eye displacement
US15/293,527 2016-10-14
US15/293,433 2016-10-14
US15/293,565 2016-10-14
US15/293,458 2016-10-14
US15/293,398 US10148932B2 (en) 2015-11-13 2016-10-14 3D system including object separation
US15/293,388 2016-10-14
US15/293,625 US20170140571A1 (en) 2015-11-13 2016-10-14 3d system including rendering with curved display
US15/293,445 US10225542B2 (en) 2015-11-13 2016-10-14 3D system including rendering with angular compensation
US15/293,423 2016-10-14
US15/293,445 2016-10-14
US15/293,514 2016-10-14
US15/293,499 2016-10-14
US15/293,565 US10122987B2 (en) 2015-11-13 2016-10-14 3D system including additional 2D to 3D conversion
US15/293,499 US10121280B2 (en) 2015-11-13 2016-10-14 3D system including rendering with three dimensional transformation
US15/293,458 US10148933B2 (en) 2015-11-13 2016-10-14 3D system including rendering with shifted compensation

Publications (1)

Publication Number Publication Date
WO2017083509A1 true WO2017083509A1 (fr) 2017-05-18

Family

ID=58695251

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/061313 Ceased WO2017083509A1 (fr) 2015-11-13 2016-11-10 Système stéréoscopique

Country Status (1)

Country Link
WO (1) WO2017083509A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113643414A (zh) * 2020-05-11 2021-11-12 北京达佳互联信息技术有限公司 一种三维图像生成方法、装置、电子设备及存储介质

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6020931A (en) * 1996-04-25 2000-02-01 George S. Sheng Video composition and position system and media signal communication system
US20060079180A1 (en) * 2004-10-12 2006-04-13 Nokia Corporation Methods, apparatus, systems and computer program products for energy management of short-range communication modules in mobile terminal devices
US7161614B1 (en) * 1999-11-26 2007-01-09 Sanyo Electric Co., Ltd. Device and method for converting two-dimensional video to three-dimensional video
US20080281767A1 (en) * 2005-11-15 2008-11-13 Bernadette Garner Method for Training Neural Networks
US20100165081A1 (en) * 2008-12-26 2010-07-01 Samsung Electronics Co., Ltd. Image processing method and apparatus therefor
US20100245548A1 (en) * 2009-02-20 2010-09-30 Taiji Sasaki Recording medium, playback device, and integrated circuit
US20120069019A1 (en) * 2010-09-22 2012-03-22 Raytheon Company Method and apparatus for three-dimensional image reconstruction
US20130027390A1 (en) * 2011-07-27 2013-01-31 Suhyung Kim Stereoscopic image display device and method for driving the same
WO2013109252A1 (fr) * 2012-01-17 2013-07-25 Thomson Licensing Production d'une image pour une autre vue
US20140035902A1 (en) * 2012-07-31 2014-02-06 Lg Display Co., Ltd. Image data processing method and stereoscopic image display using the same
WO2015026017A1 (fr) * 2013-08-19 2015-02-26 Lg Electronics Inc. Appareil d'affichage et procédé de fonctionnement associé

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6020931A (en) * 1996-04-25 2000-02-01 George S. Sheng Video composition and position system and media signal communication system
US7161614B1 (en) * 1999-11-26 2007-01-09 Sanyo Electric Co., Ltd. Device and method for converting two-dimensional video to three-dimensional video
US20060079180A1 (en) * 2004-10-12 2006-04-13 Nokia Corporation Methods, apparatus, systems and computer program products for energy management of short-range communication modules in mobile terminal devices
US20080281767A1 (en) * 2005-11-15 2008-11-13 Bernadette Garner Method for Training Neural Networks
US20100165081A1 (en) * 2008-12-26 2010-07-01 Samsung Electronics Co., Ltd. Image processing method and apparatus therefor
US20100245548A1 (en) * 2009-02-20 2010-09-30 Taiji Sasaki Recording medium, playback device, and integrated circuit
US20120069019A1 (en) * 2010-09-22 2012-03-22 Raytheon Company Method and apparatus for three-dimensional image reconstruction
US20130027390A1 (en) * 2011-07-27 2013-01-31 Suhyung Kim Stereoscopic image display device and method for driving the same
WO2013109252A1 (fr) * 2012-01-17 2013-07-25 Thomson Licensing Production d'une image pour une autre vue
US20140035902A1 (en) * 2012-07-31 2014-02-06 Lg Display Co., Ltd. Image data processing method and stereoscopic image display using the same
WO2015026017A1 (fr) * 2013-08-19 2015-02-26 Lg Electronics Inc. Appareil d'affichage et procédé de fonctionnement associé

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113643414A (zh) * 2020-05-11 2021-11-12 北京达佳互联信息技术有限公司 一种三维图像生成方法、装置、电子设备及存储介质
CN113643414B (zh) * 2020-05-11 2024-02-06 北京达佳互联信息技术有限公司 一种三维图像生成方法、装置、电子设备及存储介质

Similar Documents

Publication Publication Date Title
US10715782B2 (en) 3D system including a marker mode
US11961431B2 (en) Display processing circuitry
US20250016296A1 (en) 3d system
KR20110068870A (ko) 비디오 변환시스템에 있어서의 뎁스맵 생성
US10277879B2 (en) 3D system including rendering with eye displacement
US10122987B2 (en) 3D system including additional 2D to 3D conversion
US10277877B2 (en) 3D system including a neural network
US10939092B2 (en) Multiview image display apparatus and multiview image display method thereof
US20170140571A1 (en) 3d system including rendering with curved display
US10121280B2 (en) 3D system including rendering with three dimensional transformation
US10148933B2 (en) 3D system including rendering with shifted compensation
US10225542B2 (en) 3D system including rendering with angular compensation
WO2017083509A1 (fr) Système stéréoscopique
US10242448B2 (en) 3D system including queue management
US10284837B2 (en) 3D system including lens modeling
US20170142395A1 (en) 3d system including pop out adjustment
US10148932B2 (en) 3D system including object separation
TW201306562A (zh) 用於改善三維顯示品質的方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16864995

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16864995

Country of ref document: EP

Kind code of ref document: A1