
WO2017146972A1 - Apparatus and method for encoding high frame-rate content in standard frame-rate video using temporal interlacing - Google Patents

Apparatus and method for encoding high frame-rate content in standard frame-rate video using temporal interlacing

Info

Publication number
WO2017146972A1
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
exposure
frame
digital value
pixel digital
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2017/018046
Other languages
English (en)
Inventor
Gregory John Ward
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dolby Laboratories Licensing Corp
Original Assignee
Dolby Laboratories Licensing Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby Laboratories Licensing Corp filed Critical Dolby Laboratories Licensing Corp
Priority to US16/078,490 (granted as US10991281B2)
Publication of WO2017146972A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60: Control of cameras or camera modules
    • H04N23/68: Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/682: Vibration or motion blur correction
    • H04N23/684: Vibration or motion blur correction performed by controlling the image sensor readout, e.g. by controlling the integration time
    • H04N23/70: Circuitry for compensating brightness variation in the scene
    • H04N23/73: Circuitry for compensating brightness variation in the scene by influencing the exposure time
    • H04N23/741: Circuitry for compensating brightness variation in the scene by increasing the dynamic range of the image compared to the dynamic range of the electronic image sensors
    • H04N25/00: Circuitry of solid-state image sensors [SSIS]; Control thereof
    • H04N25/50: Control of the SSIS exposure
    • H04N25/53: Control of the integration time
    • H04N25/533: Control of the integration time by using differing integration times for different sensor regions
    • H04N25/57: Control of the dynamic range
    • H04N25/58: Control of the dynamic range involving two or more exposures
    • H04N25/587: Control of the dynamic range involving two or more exposures acquired sequentially, e.g. using the combination of odd and even image fields
    • H04N25/589: Control of the dynamic range involving two or more exposures acquired sequentially, with different integration times, e.g. short and long exposures
    • H04N25/70: SSIS architectures; Circuits associated therewith
    • H04N25/76: Addressed sensors, e.g. MOS or CMOS sensors
    • H04N25/7795: Circuitry for generating timing or clock signals

Definitions

  • the present disclosure relates to capturing and encoding video frames, and more specifically to capturing and encoding high frame-rate content and transporting it in standard frame-rate video.
  • Moving pictures are an optical trick in which a number of individual still pictures are flashed at a rate faster than the human eye can distinguish them as individual photos.
  • A faster speed of exposure comes at the cost of resolution of the individual pictures, due to fewer photons being captured by the sensor medium, and simultaneously at a greater cost of transport.
  • Conversely, the slower the speed of exposure, the greater the detail captured within the individual picture, but this occurs at the expense of blurred motion.
  • an imaging system comprises a pixel image sensor array disposed on a substrate, said pixel image sensor array comprising a plurality of pixels.
  • the imaging system further comprises a multi-stage timer coupled to said pixel image sensor array for triggering exposures of said plurality of pixels, wherein the pixels are grouped into N subsets, and the multi-stage timer is configured to trigger, for each of the N subsets, an exposure sequence of at least two exposures of different capture duration of the pixels of said subset, wherein start times of the exposure sequences of the different subsets are temporally offset by a predetermined offset t_offset, and the sequences have the same overall duration T and the predetermined temporal offset t_offset is smaller than said overall duration T.
  • the imaging system further comprises at least one analog to digital converter coupled to said pixel image sensor array and configured to convert said at least two exposures of said plurality of pixels of the subsets to pixel digital values, and a memory coupled to said at least one analog to digital converter and configured to store said pixel digital values.
  • the imaging system further comprises a logic circuit coupled to said memory and configured to determine for each pixel of the image sensor array which of the corresponding stored pixel digital values to upload to a video frame.
  • the pixels of the pixel image sensor array are assigned to N trigger groups for group-wise exposure of the pixels.
  • the exposure sequence is the same for each pixel within a trigger group, while the exposure sequences of different trigger groups are temporally offset, i.e. their start times are temporally offset.
  • the exposure sequences of different trigger groups overlap in time, i.e. their duration is the same and the offset between the start times of the different sequences is smaller than the duration of a sequence.
  • the N subsets are defined by subdividing the pixel image sensor array into subarrays of N pixels, and including in each subset a single pixel from each subarray.
  • a method of imaging comprises triggering an exposure sequence of at least two exposures of different capture duration of the pixels of each of N subsets of a plurality of pixels of a pixel image sensor array, wherein the exposure sequences are triggered in a predetermined order, wherein start times of the exposure sequences of the different subsets are temporally offset by a predetermined offset t_offset, and the sequences have the same overall duration T and the predetermined temporal offset t_offset is smaller than said overall duration T, converting said at least two exposures of said plurality of pixels of the subsets to pixel digital values, determining for each pixel of the image sensor array which of the corresponding stored pixel digital values to upload to a video frame, and uploading the determined stored pixel digital value to the video frame.
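  • To make the timing relationships concrete, the following minimal sketch (an illustration, not the patented implementation) computes the trigger schedule implied above, assuming N = 16 subsets, a 30 fps frame, a long-short sequence, and an offset equal to the short exposure duration; the function name and parameters are illustrative:

```python
def trigger_schedule(N=16, fps=30):
    """Per-subset (start, duration) exposure pairs for one frame interval.

    Assumes a long-short sequence whose total length T equals one frame
    time, with the short exposure equal to the temporal offset t_offset.
    """
    T = 1.0 / fps            # overall sequence duration T
    t_offset = T / N         # predetermined offset between subset starts
    short = t_offset         # short exposure (1/480 s for N=16 at 30 fps)
    long = T - short         # long exposure (15/480 s in the same example)
    assert t_offset < T      # the offset must be smaller than T
    return [[(k * t_offset, long), (k * t_offset + long, short)]
            for k in range(N)]

print(trigger_schedule()[1])  # subset 1 starts one t_offset after subset 0
```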
  • a method of image processing comprising receiving a first exposure of a first capture duration from a pixel, receiving a second exposure of a second capture duration from the pixel, wherein the second capture duration is less than one half of a temporal length of the first capture duration, converting the first exposure to a first pixel digital value, converting the second exposure to a second pixel digital value, multiplying the second pixel digital value based on a ratio of the first capture duration to the second capture duration, storing the first pixel digital value and the second pixel digital value and selecting one of the first pixel digital value and the second pixel digital value to upload to a video frame.
  • a method of encoding an image comprises receiving a first pixel digital value of a first exposure of a pixel, the first exposure having a first capture duration, receiving a second pixel digital value of a second exposure of the pixel, the second exposure having a second capture duration that is smaller than the first capture duration, optionally less than one half of the first capture duration, comparing the first pixel digital value to the second pixel digital value to determine a pixel digital delta and selecting for upload to a video frame the first pixel digital value if the pixel digital delta is less than a first predetermined threshold and/or selecting for upload to the video frame the second pixel digital value if said pixel digital delta exceeds a second predetermined threshold.
  • a method of encoding comprising receiving a first pixel digital value of a first capture duration from a pixel, receiving a second pixel digital value of a second capture duration from the pixel, wherein the second capture duration is less than one half of a temporal length of the first capture duration, multiplying the second pixel digital value based on a ratio of the first capture duration to the second capture duration and selecting one of the first pixel digital value and the second pixel digital value to upload to a video frame based in part on a smooth pursuit vector, wherein the smooth pursuit vector is an estimate of viewer visual tracking.
  • a method of encoding comprises receiving a first pixel digital value of a first capture duration from a pixel, receiving a second pixel digital value of a second capture duration from the pixel, wherein the second capture duration is less than one half of a temporal length of the first capture duration, converting the first pixel digital value into a first irradiance, converting the second pixel digital value into a second irradiance, determining a relative irradiance from subtraction of the first irradiance from the second irradiance and selecting the first pixel digital value to upload to a video frame if the absolute value of the relative irradiance is greater than a predetermined threshold.
  • a method of encoding comprises receiving a first pixel digital value of a first capture duration from a pixel, receiving a second pixel digital value of a second capture duration from the pixel, wherein the second capture duration is less than one half of a temporal length of the first capture duration, converting the first pixel digital value into a first irradiance, converting the second pixel digital value into a second irradiance, determining a relative irradiance from subtraction of the first irradiance from the second irradiance, mixing the first pixel digital value and the second pixel digital value to form a mixed pixel digital value based on the first irradiance and the second irradiance if the relative irradiance is less than a predetermined threshold and selecting the mixed pixel digital value to upload to a video frame.
  • a method of decoding comprises receiving encoded video frames having a first pixel resolution, wherein each pixel of the encoded video frames corresponds to one of a plurality of subframes, wherein the correspondence between the pixels and the subframes is defined by a temporal offset matrix, extracting subframes overlapping a target output interval from the encoded video frames using the temporal offset matrix, the subframes having a second pixel resolution, upscaling the extracted subframes to the first pixel resolution, wherein the upscaling preferably comprises spatially interpolating, generating a high temporal resolution image by forming a linear combination of the upscaled subframes, preferably by averaging the upscaled subframes, selecting one of the encoded video frames that overlaps the target output interval as a high spatial resolution image, forming downsampled subframes by downsampling the spatially interpolated subframes to the second pixel resolution, forming a downsampled average frame by downsampling and averaging at least a subset of the encoded video frames overlapping the target output interval, and determining a norm on the basis of differences between the downsampled subframes and the downsampled average frame.
  • the norm determined on the basis of the differences may for example be an L1 norm, i.e. a sum of the absolute values of the differences, or an L2 norm, i.e. a sum of the squares of the differences.
  • a method of decoding a temporally offset frame comprises interpolating a set of N frames spatially utilizing a set of temporal offset matrix positions corresponding to a target output interval, averaging the N interpolated frames to form a motion image frame, down-sample matching a spatial resolution of each of the N interpolated frames to form a down-sample matched frame, down-sample averaging a set of N original frames utilizing the down-sample matched spatial resolution to form a down-sample averaged frame, determining a set of squared difference values between the down-sample matched frame and the down-sample averaged frame, summing the set of squared difference values to form a summed squared difference frame, up-sampling the summed squared difference frame, resetting the up-sampled summed squared difference values above a predetermined upper threshold to one and below a predetermined lower threshold to zero, determining a mixing image utilizing the reset summed squared difference values, selecting a dominant frame that corresponds to a maximum overlap with the target output interval, and combining the selected dominant frame with the motion image frame based on the mixing image to form an output frame.
  • FIG. 1 is an overview of an example system accordance with one embodiment of the disclosure.
  • FIG. 2 is an example temporal offset matrix in accordance with one embodiment of the disclosure.
  • FIG. 3 is a single frame in accordance with one embodiment of the disclosure.
  • FIG. 4 is a close up of a single frame in accordance with one embodiment of the disclosure.
  • FIG. 5 is a single frame that is temporally dithered at 19 fps rendering in accordance with one embodiment of the disclosure.
  • FIG. 6 indicates areas of motion within a single frame in accordance with one embodiment of the disclosure.
  • FIG. 7 is a single frame that is blurred at 19 fps rendering in accordance with one embodiment of the disclosure.
  • FIG. 8 is a single frame that is blurred at 60 fps rendering in accordance with one embodiment of the disclosure.
  • FIG. 9 is a single frame that is tracked at 60 fps with 80% black point insertion rendering in accordance with one embodiment of the disclosure.
  • FIG. 10 is a single frame that is tracked at 15 fps with 75% black point insertion rendering in accordance with one embodiment of the disclosure.
  • FIG. 11 is a single frame that is tracked at 24 fps for an RGB color wheel rendering in accordance with one embodiment of the disclosure.
  • FIG. 12 is an example CMOS capture times and offsets in accordance with one embodiment of the disclosure.
  • FIG. 13 is an example of a method of image processing in accordance with one embodiment of the disclosure.
  • FIG. 14 is a first example of a method of encoding an image in accordance with one embodiment of the disclosure.
  • FIG. 15 is a second example of a method of encoding an image in accordance with one embodiment of the disclosure.
  • FIG. 16 is a third example of a method of encoding an image in accordance with one embodiment of the disclosure.
  • FIG. 17 is a fourth example of a method of encoding an image in accordance with one embodiment of the disclosure.
  • FIG. 18 is an example of a method of decoding an image in accordance with one embodiment of the disclosure.
  • FIG. 19 depicts a 2 x 2 Bayer pattern combined with a 3 x 3 temporal offset dither pattern according to an embodiment of this invention.
  • FIG. 20 depicts an example sensor circuit for capturing image data according to an embodiment of this invention.
  • FIG. 21 depicts an example decoder for decoding and reconstructing output frames according to an embodiment of this invention.
  • Judder occurs when untracked motion is represented by short exposures separated by some time Δt (e.g., using a 30° shutter that is open for 1/12th of the frame time).
  • the moving object flashes one place and then again in a different place, and high-contrast edges or silhouettes appear to flash where their motion should be smooth.
  • Smooth pursuit describes motion that is of interest to, and is tracked by, the viewer; it may be quantified by a smooth pursuit vector, an estimate of viewer visual tracking.
  • an eye movement vector is subtracted from a local image motion vector and regions where significant eye and object motion vectors cancel correspond to smooth pursuit.
  • Partially canceling vectors may be used to change the local shutter time proportionally.
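  • A hedged numpy sketch of this cancellation test follows; the array shapes, threshold, and proportional shutter rule are assumptions made for illustration:

```python
import numpy as np

def smooth_pursuit(image_motion, eye_motion, thresh=0.5):
    """Flag regions where eye and object motion vectors cancel.

    image_motion: (H, W, 2) per-pixel local image motion vectors
    eye_motion:   (2,) estimated eye movement vector
    thresh:       assumed magnitude cutoff (pixels/frame)
    """
    residual = image_motion - eye_motion             # subtract eye vector
    res_mag = np.linalg.norm(residual, axis=-1)
    obj_mag = np.linalg.norm(image_motion, axis=-1)
    pursuit = (obj_mag > thresh) & (res_mag < thresh)  # vectors cancel
    # partially cancelling vectors scale the local shutter proportionally
    shutter_scale = np.clip(res_mag / np.maximum(obj_mag, 1e-6), 0.0, 1.0)
    return pursuit, shutter_scale
```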
  • Judder artifacts may be avoided by using a more open shutter; a 360° shutter allows motion to be captured during the entire frame duration, smearing the motion and causing excessive blurring.
  • Current video content is a compromise between these shutter extremes, and thus less than optimal.
  • High frame-rate video overcomes some of these issues by presenting more frames per second than standard video.
  • Many LCD displays cannot refresh effectively at more than 120 frames per second (fps), and they get less energy-efficient and photon-efficient at higher refresh rates.
  • the instant disclosure may provide a higher perceptible frame rate without the associated cost in bandwidth and display technology.
  • the instant disclosure provides a method of deciding at what rate each pixel captures data and a method of encoding this multi-rate frame content, which may be selectively blurred or strobed on a standard frame-rate display based on motion estimation, viewer eye movement estimation, or viewer eye tracking.
  • high frame-rate video reduces an effective dynamic range of the captured image by raising the noise floor in captured content. This is due to the high illumination and/or gain requirements of short exposure capture.
  • the instant disclosure may address this with a reprogrammed CMOS sensor that captures sequences of at least two exposures of different duration that are offset in time for different pixels, e.g. long-short-long-short exposure sequences offset in time at the individual pixel.
  • the present disclosure describes a system for capturing, processing and transmitting high frame-rate video within a standard frame rate.
  • a camera sensor captures a short and a long exposure.
  • during processing, based on a simple decision tree, it is decided whether to utilize the short or the long exposure. The decision may differ between individual pixels.
  • the interlaced pattern may be transmitted from an encoder to the decoder.
  • One solution proposed by the instant disclosure is to send high frame-rate information in a standard frame-rate video. Rather than capturing and recording video frames from a single time window, a time-offset array is used to record adjacent, short-exposure pixels from shifted points in time distributed over the frame's time window.
  • Motion blur may appear as a dither pattern.
  • Rendering for dither correction may entail reducing or expanding the time window in different parts of the image. Where motion is estimated, eye tracking is estimated, or the observer is known to visually track an object, the object may be strobed for a sharper appearance.
  • untracked objects may be rendered with full motion blur.
  • the proposed format may improve video frame-rate conversion results, both for slower and for higher frame rates.
  • By designing image sensors to record long-short-long-short exposure sequences, a low-noise high dynamic range (HDR) version of this representation may be created while avoiding the ghosting issues normally associated with high dynamic range capture.
  • the instant disclosure proposes methods to send high frame-rate data in a standard frame-rate video by interlacing short-exposure pixels in a tiled matrix. Displayed on a standard frame-rate device, such video may show dither patterns where there is high-speed motion, which from a distance may be equivalent to a 360° shutter (motion blur).
  • the amount of motion blur may be modified to simulate any equivalent shutter. This selective modification may be different over different parts of the image based on the degree of motion and the expected or measured eye movement for the scene. In regions where there is little motion, pixels may be captured at a lower rate or averaged to reduce the noise associated with high frame-rate sensors. By modifying the capture method, high dynamic range pixels may be obtained as well.
  • FIG. 1 depicts an imaging system 100.
  • the image system comprises a pixel image sensor array 110 disposed on a substrate, the pixel image sensor array 110 comprising a plurality of pixels 112. S x T subsets of said plurality of pixels are defined.
  • a multi-stage timer 114 is coupled to the pixel image sensor array 110 and configured to trigger, for each of the subsets, a sequence of at least two exposures of different capture duration of the pixels of said subset.
  • the sequences corresponding to different subsets are triggered in a predetermined order, with start times of subsequent sequences being temporally offset by a predetermined offset t_offset.
  • the sequences have the same overall duration T and the predetermined temporal offset t_offset is smaller than said overall duration.
  • An analog to digital converter (ADC) 116 is coupled to the pixel image sensor array 110 and converts the at least two exposures of the at least one of the plurality of pixels of the subsets to pixel digital values.
  • a memory 118 is coupled to the at least one ADC 116 to store the pixel digital values.
  • a logic circuit 120 is coupled to the memory 118 and determines for each pixel of the image sensor array which of the corresponding stored pixel digital values to upload to a video frame. The logic circuit may scale, e.g. multiply, the stored pixel digital values based upon the different exposure durations, e.g. on the basis of a ratio between two capture durations.
  • for example, if the ratio of the long capture duration to the short capture duration is k, the stored pixel digital value of the short exposure is scaled by multiplying by k, or the stored pixel digital value of the long exposure is scaled by multiplying by 1/k.
  • the logic circuit may determine for each pixel which of the corresponding pixel digital values to upload based on a degree of movement on the basis of the stored pixel digital values of the at least two exposures of different capture durations. The degree of movement is determined by a difference threshold between the two measured exposures at a pixel. If the difference is above the threshold determined by the noise level in the shorter exposure plus a local sensitivity adjustment, then the shorter exposure is used for this position.
  • the pixel digital value of the short exposure or the pixel digital value of the long exposure is first scaled, and subsequently the absolute difference between the scaled pixel digital value and the other pixel digital value is calculated. If said difference exceeds a predetermined threshold, which may be indicative of an expected noise level, it is determined that movement was present for this pixel and the pixel digital value corresponding to the shortest exposure is selected for upload.
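  • A minimal sketch of this per-pixel selection, assuming 12-bit values, a long:short duration ratio k = 15, and a fixed noise threshold standing in for the adaptive one described above:

```python
import numpy as np

def select_exposure(long_val, short_val, k=15.0, noise_thresh=64.0):
    """Return per-pixel values, preferring the short exposure under motion.

    long_val, short_val: arrays of 12-bit pixel digital values
    k: ratio of long to short capture duration (assumed 15)
    noise_thresh: expected noise of the scaled short exposure (assumed)
    """
    scaled_short = short_val * k                # scale to a common exposure
    delta = np.abs(scaled_short - long_val)     # pixel digital delta
    use_short = delta > noise_thresh            # motion or overflow detected
    frame = np.where(use_short, scaled_short, long_val)
    return frame, use_short                     # bitmap feeds threshold pooling
```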
  • FIG. 2 depicts a pixel population (200) having a well-mixed temporal offset matrix.
  • the pixel image sensor array is divided into a plurality of subarrays of dimension S x T, and the pixels of each subarray are assigned to subsets according to the temporal offset matrix. This has the advantage that the pixels of each subset are distributed substantially uniformly over the sensor area.
  • video is transmitted at a unified frame rate of 30 frames per second (fps), where the shortest exposure duration of the pixels is 1/16th of the total frame time, or 1/480th of a second.
  • the shortest exposure duration may be the same as the temporal offset between the sequences.
  • the disclosure applies to any frame rate, and especially to those specified by MPEG (the Moving Picture Experts Group).
  • the offset placement in the temporal offset matrix is arbitrary, but the results may show less aliasing if the offsets are well-mixed in the sense that adjacent values including neighboring tiles have good separation in time.
  • the temporal offset matrix is constructed such that any two horizontally adjacent pixels and any two vertically adjacent pixels of the pixel image sensor array are not immediately following each other in the predetermined order in which the sequences of these pixels are triggered.
  • a simple scan-line ordering is undesirable.
  • the temporal offset matrix may be rotated, shifted, or scrambled on a frame-by-frame basis to reduce the "screen door effect." So long as the temporal offset matrix is sent as metadata with the frame, or derivable from a known sequence with a specified starting point, the offsets are easily recovered. Scrambling the temporal offset matrix on a frame-by-frame basis may allow a dither pattern to be hidden from view. In one example data pipeline, a separate rendering stage would be implemented to take advantage of the higher encoded frame rate.
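  • The sketch below builds one possible well-mixed matrix by rejection sampling and applies a per-frame shift; the toroidal adjacency test (treating offsets 0 and N-1 as consecutive) is one assumed way to check the "good separation" property described above:

```python
import numpy as np

def well_mixed(m):
    """True if no two horizontally or vertically adjacent cells (including
    wrap-around into neighboring tiles) hold consecutive trigger offsets."""
    n = m.size
    for axis in (0, 1):
        diff = (m - np.roll(m, -1, axis=axis)) % n
        if np.any((diff == 1) | (diff == n - 1)):
            return False
    return True

rng = np.random.default_rng(7)
m = np.arange(16).reshape(4, 4)        # scan-line order: deliberately bad
while not well_mixed(m):               # reject until well-mixed
    m = rng.permutation(16).reshape(4, 4)
m_next = np.roll(m, shift=(1, 1), axis=(0, 1))  # per-frame shift; must be
                                                # sent (or derivable) as metadata
```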
  • FIG. 3 depicts an interlaced matrix of pixel data in a picture frame in areas having significant motion.
  • FIG. 4 is a close-up of this interlaced matrix. The possible mitigation of this blurring is one issue that the instant disclosure seeks to address.
  • FIGS. 5-11 depict different rendering conditions in accordance with various embodiments of the instant disclosure.
  • FIG. 5 depicts a single temporally dithered video frame (510) taken at 19 fps.
  • FIG. 6 shows the motion regions within the video frame, wherein white regions correspond to areas of higher motion than dark regions.
  • FIG. 7 shows a single full blur video frame at 19 fps.
  • FIG. 8 shows a single full blur video frame at 60 fps.
  • FIG. 9 depicts a single tracked video frame at 60 fps with 80% black point insertion.
  • FIG. 10 shows a single tracked video frame at 15 fps with 75% black point insertion.
  • FIG. 11 shows a single 24 fps video frame (1110) generated using an RGB color wheel, such as in a digital light processing (DLP) device, showing an example of color breakup.
  • the color ball (11100), originally in yellow and gray, is displayed using shades of red (11020) and green (11010, 11030).
  • Embodiments of this invention allow for inserting into a frame temporal information that allows rendering visually tracked objects (such as the ball 11100) without perceived color breakup.
  • N sub-frames may be recorded for video frames transmitted at 30 fps.
  • a bi-cubic interpolation method or the like may be used to up-sample these sub-frames to the overall frame resolution, and the result may be a spatially blurred version of video at 480 fps.
  • Trading spatial resolution for temporal resolution is logical for rapidly moving objects the viewer is expected (or observed) to track. Moving objects that are not being tracked by the viewer may be blurred spatially to achieve the equivalent of a 360° shutter. The spatial blur in such regions may be masked by the blur due to motion.
  • Such selective blurring assumes some information about how the viewer's eyes are tracking motion in the scene. This may be provided as metadata specified by the director, as averaged eye-tracking measurements from test viewers, from eye-tracking hardware built into the display, from motion estimation based on frame-to-frame deltas, or from eye-tracking estimation based on frame-to-frame deltas and their location within the frame.
  • displays with array backlights may use selective black point insertion (backlight flashing) to improve temporal resolution in smooth pursuit regions, while using more continuous backlight illumination in regions where retinal blur is natural. Regions of smooth pursuit comprise areas where an eye movement vector is similar to a local image motion vector. Partially canceling vectors may be used to change the local shutter time proportionally.
  • a first method is to take the output of a full-resolution, high frame-rate sensor and select the temporal offsets for individual pixels.
  • a second method is to take a lower-resolution, high frame-rate sensor (e.g., 1/4 resolution in x-y dimensions) and use sensor-shifting to capture the individual pixels at their associated times. This assumes the sensor has a low fill factor, as is often the case with CMOS designs.
  • a third method employs a full-resolution CMOS sensor in which adjacent pixels are programmed to capture offset long-short-long-short exposure sequences.
  • FIG. 12 depicts an example of CMOS times and offsets in accordance with this third method.
  • the duration of the short exposure corresponds to the temporal offset.
  • the circuit of a CMOS sensor is adapted to group the pixels of each subset together. Therefore, this exemplary CMOS sensor can expose the pixels subset-by-subset instead of row-by-row.
  • the CMOS sensor circuit defines N subsets of pixels by subdividing the pixel image sensor array into subarrays of N pixels and including in each subset a single pixel from each subarray. It is noted that the order of triggering the different subsets is determined by the multi-stage timer and may be reprogrammable. For example, as described herein, the order the subsets are triggered, e.g. defined by a temporal offset matrix, may be changed on a frame-by-frame basis by the multi-stage timer.
  • the long-short exposure sequence may offer two related advantages: low noise and high dynamic range.
  • the output of the analog-to-digital converter (ADC) may be two 12-bit linear values for the captured frame. If the two values agree within the expected noise of the shorter exposure, then the long exposure may be utilized as this indicates that minimal motion occurred within the frame.
  • the output of the long exposure and the short exposure may be at variance.
  • an object may be moving across a given pixel within the frame.
  • motion can be detected if the absolute difference between the exposure values of the short and long exposures is larger than a threshold. While pixel values differ due to the exposure time, exposure values (EV) change only if the scene changes or the pixel is saturated.
  • the short exposure may be selected for its temporal accuracy.
  • alternatively, the number of photons received by the given pixel is high, causing the long exposure to saturate or overflow, i.e., clip.
  • the shorter exposure is selected. In either case, the noise in the short exposure may not be an issue, because either it is masked by scene motion or it has captured a sufficient number of photons to be above the noise floor. If a short exposure is not selected, then the longer exposure is selected.
  • Irradiance is the radiant flux (or power) received by a surface per unit area and radiant exposure is the irradiance of a surface integrated over the time of irradiation.
  • One example of the instant disclosure measures the change in irradiance at a pixel site that is the result of either motion at that position or pixel overflow at that position.
  • the relative irradiance of the pixel site from both the long and short exposure allows differentiation of motion from pixel overflow, in either case the shorter exposure is selected.
  • the threshold corresponding to this relative irradiance may be modified by a local function of recently detected motion, based on a bitmap resulting from previous decisions.
  • the bitmap may be a grayscale image and the selection may comprise a mixing for some range of relative irradiance wherein a fraction of both the short and long exposure values are mixed.
  • When a short exposure is selected, it may be multiplied by an appropriate scalar, which in one example is equivalent to shifting the value left by 4 bits, thus achieving a low-noise 16-bit signal from a 12-bit ADC.
  • the scalar may be based on the ratio of the different capture durations.
  • a compressed bitmap indicating which pixels used long versus short exposures may be useful if the decision is made in the camera's sensor or SOC. For example, a blurred version of this bitmap could be used to adjust the noise thresholds for the next frame, improving the motion sensitivity by pooling local information.
  • the threshold for the image sensor may benefit from a pooling of information: raising the threshold avoids noise spikes in regions where no motion was recently seen, and lowering the threshold helps in image regions where motion is estimated. This may be accomplished by blurring the bitmap of pixels chosen from the short exposure in a previous frame and using this blurred image to adjust the threshold, as sketched below.
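  • A sketch of this pooling, assuming a simple uniform blur and a linear threshold adjustment (the filter size and gain are illustrative):

```python
import numpy as np
from scipy import ndimage

def pooled_thresholds(motion_bitmap, base=64.0, swing=32.0):
    """Blur last frame's short-exposure bitmap into per-pixel thresholds.

    motion_bitmap: 0/1 array of pixels that used the short exposure
    Returns higher thresholds where no motion was recently seen (fewer
    noise spikes) and lower thresholds where motion is estimated.
    """
    density = ndimage.uniform_filter(motion_bitmap.astype(float),
                                     size=5, mode="nearest")
    return base + swing * (1.0 - 2.0 * density)
```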
  • a high frame-rate CMOS sensor may be used in the capture method described by summing together 15 of the 16 sub-frame captures and applying the same selection criteria. In this case, it may not be necessary to choose the short exposure for bright pixels, only for those with significant motion. However, the reduction in noise may not be as great, since many sensors are dominated by readout noise rather than shot noise. Because in this example a readout is performed 15 times for the synthesized long exposure, the noise cancels much less effectively.
  • a perennial problem with video is converting between different frame rates. If frame-rate conversion takes place at display time, one method for selective motion rendering described earlier is altering an integration window to be larger or smaller to accommodate the display's target frame rate. If frame-rate conversion is being performed off-line, an adjustment to the time interpolation matrix may match the longer or shorter frame time. This method allows the option of selectively rendering the video. For example, the matrix may be extended from a 16-entry offset table to a 20-entry table (perhaps in a 4x5 matrix) when converting from 30 to 24 fps. Converting to a higher frame rate may result in a smaller matrix, which may correspond to a slight loss in spatial resolution unless some form of motion interpolation was applied. To facilitate motion interpolation, the detailed temporal information may yield more accurate motion vectors.
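  • As a worked instance of this resizing rule, the offset-table size is just the number of maximum-rate subintervals per target frame (a sketch under the 480 fps maximum-rate example used in this disclosure):

```python
def offset_table_entries(max_rate=480, target_fps=24):
    """Offset-table entries so one target frame spans whole subintervals."""
    assert max_rate % target_fps == 0, "non-integer case needs interpolation"
    return max_rate // target_fps

print(offset_table_entries(target_fps=24))  # 20 (e.g., a 4x5 matrix)
print(offset_table_entries(target_fps=30))  # 16 (the original 4x4 matrix)
```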
  • Video compression relies on estimating image motion vectors, therefore adaptation to existing codecs with minor modifications may be expected.
  • Motion vectors may be computed from the sub-frames of a de-matrixed (de-interlaced) source video. These motion vectors may be useful for compression as well as the adaptive motion rendering described earlier.
  • the sub-frames may also be compressed separately in regions of significant motion to reduce high frequency spatial content, or the frames may be pre-rendered assuming a particular eye motion vector and sent as standard video. The motion rendering and frame-rate conversion may be combined as well.
  • FIG. 13 depicts a method of image processing 1300 that comprises: receiving 1310 a first exposure of a first capture duration from a pixel, receiving 1312 a second exposure of a second capture duration from the pixel, wherein the second capture duration is less than one half of a temporal length of the first capture duration and converting 1314 the first exposure to a first pixel digital value.
  • the method also comprises converting 1316 the second exposure to a second pixel digital value, multiplying 1318 the second pixel digital value based on a ratio of the first capture duration to the second capture duration, storing 1320 the first pixel digital value and the second pixel digital value and selecting 1322 one of the first pixel digital value and the second pixel digital value to upload to a video frame.
  • FIG. 14 depicts a first method of encoding an image 1400, which comprises: receiving 1410 a first pixel digital value of a first capture duration from a pixel and receiving 1412 a second pixel digital value of a second capture duration from the pixel, wherein the second capture duration is less than one half of a temporal length of the first capture duration.
  • the method further comprises comparing 1414 the first pixel digital value to the second pixel digital value to determine a pixel digital delta and selecting 1416 for upload to a video frame the first pixel digital value if the pixel digital delta is less than a pre-determined noise value.
  • FIG. 15 depicts a second method (1500) of encoding, comprising: receiving 1510 a first pixel digital value of a first capture duration from a pixel, receiving 1512 a second pixel digital value of a second capture duration from the pixel, wherein the second capture duration is less than one half of a temporal length of the first capture duration, multiplying 1514 the second pixel digital value based on a ratio of the first capture duration to the second capture duration and selecting 1516 one of the first pixel digital value and the second pixel digital value to upload to a video frame based in part on a smooth pursuit vector, wherein the smooth pursuit vector is an estimate of viewer visual tracking.
  • FIG. 16 depicts a third example method of encoding (1600), comprising: receiving 1610 a first pixel digital value of a first capture duration from a pixel, receiving 1612 a second pixel digital value of a second capture duration from the pixel, wherein the second capture duration is less than one half of a temporal length of the first capture duration, converting 1614 the first pixel digital value into a first irradiance, converting 1616 the second pixel digital value into a second irradiance, determining 1618 a relative irradiance from subtraction of the first irradiance from the second irradiance and selecting 1620 the first pixel digital value to upload to a video frame if the absolute value of the relative irradiance is greater than a predetermined threshold.
  • FIG. 17 depicts a fourth example method (1700) of encoding, comprising: receiving 1710 a first pixel digital value of a first capture duration from a pixel, receiving 1712 a second pixel digital value of a second capture duration from the pixel, wherein the second capture duration is less than one half of a temporal length of the first capture duration, converting 1714 the first pixel digital value into a first irradiance, converting 1716 the second pixel digital value into a second irradiance, determining 1718 a relative irradiance from subtraction of the first irradiance from the second irradiance, mixing 1720 the first pixel digital value and the second pixel digital value to form a mixed pixel digital value based on the first irradiance and the second irradiance if the relative irradiance is less than a predetermined threshold and selecting 1722 the mixed pixel digital value to upload to a video frame.
  • FIG. 18 depicts an example method (1800) of decoding a temporally offset frame, e.g. a frame of a video encoded as described above with respect to FIGS. 13-17, the method of decoding (1800) comprising interpolating 1810 a set of Q frames spatially utilizing a set of temporal offset matrix positions corresponding to a target output interval, averaging 1812 the Q interpolated frames to form a motion image frame, down-sample matching 1814 a spatial resolution of each of the Q interpolated frames to form a down-sample matched frame and down-sample averaging 1816 a set of Q original frames utilizing the down-sample matched spatial resolution to form a down-sample averaged frame.
  • the method further comprises determining 1818 a set of squared difference values between the down-sample matched frame and the down-sample averaged frame, summing 1820 the set of squared difference values to form a summed squared difference frame, up-sampling 1822 the summed squared difference frame, resetting 1824 the up-sampled summed squared difference values above a predetermined upper threshold to one and below a predetermined lower threshold to zero and determining 1826 a mixing image utilizing the reset summed squared difference values.
  • the method also comprises selecting 1828 a dominant frame that corresponds to a maximum-overlap with the target output interval and combining 1830 the selected dominant frame with the motion image frame based on the mixing image to form an output frame.
  • the letter Q in this example corresponds to how many of the frames recorded in the highest recorded frame rate overlap the target frame interval.
  • the maximum frame rate is 480 fps, i.e. 30x4x4.
  • Q would be 4, corresponding to the number of 480 fps frames in a 120 fps target frame; or 480/120.
  • For a 60 fps target output interval, Q would be 8.
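  • Q follows directly from the frame rates, as in this one-liner (assuming the 480 fps maximum of the running example):

```python
def q_subframes(max_fps=480, target_fps=120):
    """Q = number of maximum-rate sub-frames overlapping one target frame."""
    return max_fps // target_fps

print(q_subframes(target_fps=120), q_subframes(target_fps=60))  # 4 8
```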
  • an "encoded frame" is a transmitted video frame with an embedded temporal offset matrix.
  • a "time slice" or "sub-frame" is a decoded frame corresponding to a temporal offset matrix interval.
  • a "target output interval" is the period between virtual shutter open and virtual shutter close.
  • a decoder may perform the following: 1. extract the time slices overlapping the target output interval from the encoded frames using the temporal offset matrix; 2. spatially interpolate and upscale the extracted time slices to the target resolution; 3a. downsample the interpolated time slices, one example of which would be to downsample by a factor of 4 in each dimension for a 4x4 temporal offset matrix; 3b. downsample and average encoded frames by the same spatial factors used in (3a); 3c. determine the squared differences between each downsampled time slice from (3a) and the single averaged frame from (3b) and add these differences together, or alternatively utilize the maximum of the squared differences; 4. up-sample the summed differences and threshold them to form a mixing image; 5. average the interpolated time slices to form a "motion image"; 6. select a dominant frame; 7. combine the dominant frame with the motion image using the mixing image.
  • Step 2 comprises spatial interpolation and upscaling.
  • the former (the target resolution) may be HD or 4K to match the desired target display resolution, and the latter (the sub-frame resolution) may be one quarter of the target resolution in each dimension if a 4x4 matrix is used.
  • Upscaling accounts for the matrix positions, and interpolation is performed on the maximum frame-rate images using bi-cubic, bilinear, or other up-scaling methods known in the art, such that the results align properly with each other.
  • In step 3a, the down-sample dimension is 480x270 in this example.
  • the up-sample dimension is the target resolution, 1920x1080 in this example. Additional image operations before and after up-sampling, such as erosion, dilation, and blurring, may be applied to the mixing image to improve the final results.
  • the "motion image” may correspond to a high temporal resolution image.
  • the "motion image” may be formed as a linear combination of the time slices, preferably by averaging the spatially interpolated time slices.
  • the "motion image” can be formed by selecting one of the spatially interpolated time slices or forming a weighted average of the spatially interpolated time slices.
  • In step 6, as an example, one may compute the dominant frame as the input frame with the most temporal overlap with the output frame.
  • the dominant frame may correspond to a high spatial resolution image.
  • In step 7, the dominant frame is combined with the motion image using the mixing image, which corresponds to detected motion in the video. Where there is motion, the average of the Q sub-frames, i.e., the "motion image" of step 5, is utilized; where there is no motion, the dominant input frame, with its higher spatial resolution, is utilized.
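  • The following numpy sketch strings steps 2-7 together for one target interval; block averaging stands in for the down-sampling, nearest-neighbour upscaling stands in for bi-cubic interpolation, and step 3b is simplified to use only the dominant frame, so the structure and thresholds are assumptions rather than the patented algorithm:

```python
import numpy as np

def block_down(img, f):
    """Average-pool by factor f in each dimension (steps 3a/3b)."""
    h, w = img.shape
    return img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def block_up(img, f):
    """Nearest-neighbour upscale by factor f (stand-in for bi-cubic)."""
    return np.kron(img, np.ones((f, f)))

def decode_interval(subframes, dominant, f=4, lo=0.01, hi=0.05):
    """subframes: Q low-res time slices (h/f x w/f) in the target interval
    dominant:  encoded frame with maximum overlap (h x w)
    lo, hi:    assumed lower/upper mixing thresholds
    """
    up = [block_up(s, f) for s in subframes]             # step 2: upscale
    motion_image = np.mean(up, axis=0)                   # step 5
    avg = block_down(dominant, f)                        # step 3b (simplified)
    sq = sum((block_down(u, f) - avg) ** 2 for u in up)  # step 3c
    mix = np.clip((block_up(sq, f) - lo) / (hi - lo), 0.0, 1.0)  # step 4
    return dominant * (1.0 - mix) + motion_image * mix           # step 7
```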
  • FIG. 21 depicts a simplified version of a decoding process (2100) according to another embodiment. As depicted in FIG. 21, a decoder may receive input frames (2110) and associated metadata (2120), such as the interlaced pattern used to construct the input frames, e.g. the temporal offset matrix.
  • for example, given an m x n input frame and a 4x4 pattern, the receiver may reconstruct 16 sub-frames, each of size m/4 x n/4.
  • output images (2155) are generated by blending one of the sub-frames with either the input image or another dominant frame.
  • If I_α denotes a blending, mixing, or mask image with pixel values α_j representing the blending coefficient for the j-th pixel between the input image and each i-th sub-frame, where 0 ≤ α_j ≤ 1, then blending may be expressed as

    I_o^i(j) = I_in(j) * (1 - α_j) + I_i(j) * α_j ,    (1)

  • where I_o^i(j) denotes the j-th output pixel value for the output frame based on the i-th sub-frame, I_in(j) denotes the corresponding pixel in the input or dominant frame, and I_i(j) denotes the corresponding pixel in the i-th sub-frame.
  • the alpha blending parameters may be proportional to detected motion between the i-th sub-frame and the input frame.
  • the output image pixels will represent an up-scaled version of pixels in the i-th sub-frame.
  • Up-scaling the i-th sub-frame to full resolution typically causes blurring; however, since there is motion, this blurring will be masked by the motion and will be visually imperceptible.
  • the blending parameters are computed in step (2140).
  • the blending parameters may be computed by the following process: α_j may be computed as a function of max_i(e_ij), where e_ij denotes a per-pixel difference metric between the i-th sub-frame and the input frame. If α_j is lower than a lower threshold, then α_j may be set to 0, and if α_j is higher than an upper threshold, then α_j may be set to 1.
  • the blending parameters computed at the lower resolution may be further processed to generate the final blending image (e.g., I_α) at the full (input) resolution.
  • step (2150) performs the blending (e.g., see equation (1)) and generates the corresponding output image.
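  • A direct transcription of equation (1), with α_j derived from per-pixel sub-frame differences e_ij; the threshold values are assumptions:

```python
import numpy as np

def blend(I_in, I_i, alpha):
    """Equation (1): I_o = I_in * (1 - alpha) + I_i * alpha, elementwise."""
    return I_in * (1.0 - alpha) + I_i * alpha

def alpha_from_diffs(e, lo=0.01, hi=0.05):
    """alpha_j from max_i(e_ij): clamped to 0 below the lower threshold
    and to 1 above the upper threshold (threshold values assumed)."""
    return np.clip((e.max(axis=0) - lo) / (hi - lo), 0.0, 1.0)
```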
  • blending may be performed on a combination of the sub-images, which are being offset according to the received interlaced pattern. For example, for an input frame rate of 30 fps and a 4 x 4 pattern, the process described earlier will generate 480 frames per second. If the desired output frame rate is lower, then, before the blending operation, sub-frames may be combined together, for example, using averaging or other interpolation techniques.
  • every four sub-frames may be combined into a single super sub-frame, and blending may be performed using the input frame and the super sub-frames. In other embodiments, to reduce computations, extra sub-frames may be skipped completely. Alternatively, blending may be performed at the sub-frame level, and the output frame rate may then be adjusted by either combining output frames generated at full frame rate or skipping frames.
  • the exposure range of the temporally dithered capture system described earlier can be further extended by embedding an additional very short exposure within the short exposure period of the short-long-short-long repeated sequence of exposures.
  • the original short-long sequence will typically be separated by just under four "stops" or "exposure values," which corresponds to a numerical factor of 16. E.g., one may pair a 15/480-th second exposure with a 1/480-th second exposure. This yields a signal-to-noise ratio or dynamic range increase of about 24 decibels.
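  • For confirmation, the quoted figures follow from the exposure ratio:

```latex
\frac{15/480\,\mathrm{s}}{1/480\,\mathrm{s}} = 15 \approx 2^{4}
\quad\text{(just under four stops, a factor of 16)},
\qquad 20\log_{10} 16 \approx 24\,\mathrm{dB}.
```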
  • the long-short sequence cannot easily be stretched beyond this because it would introduce unacceptable noise into moving regions. While 24 dB is a good increase, it may not be sufficient to capture the brightest highlights in certain scenes.
  • to extend the range, one or more even shorter exposures may be added as subintervals in the exposure sequence. In doing this, there is no need to alter the temporal offset dither between adjacent pixels in the sensor matrix.
  • In an alternative sensor architecture using a dual conversion gain (DCG) gate and an overflow capacitor, one or two readouts can occur: 1) reading the signal stored on the floating diffusion (FD) region with the DCG gate closed; if the well was not saturated, this will be the traditional high-gain signal; 2) reading out the capacitor, enabled by opening the gate associated with DCG. This readout will be a measure of the overflow electrons, in addition to a small fraction of the below-threshold electrons residing in the FD region. Using this approach, one could avoid another shorter exposure, but this would possibly require another readout from each pixel, plus the added gate and capacitor in the unit cell.
  • this architecture could also be used to store two different exposures within the unit cell (one on the capacitor followed by a reset of the FD region, the other on the FD region)
  • the DCG gate would be fully switched on/off in the traditional way, rather than being controlled by a bias voltage.
  • another way to shift the exposure range is to subdivide each of the long-short periods, then sum the multiple exposure readouts into a single value per enclosing period. This may add a few dB to the dynamic range, but more importantly avoids overflow without changing the effective exposure time. Thus, continuity for frame rate retargeting is maintained.
  • the long and short exposure periods should be subdivided by the same divisor, typically into 2, 4, 8, or 16 subintervals that partition the enclosing (long or short) exposure time. There will then occur a corresponding number of readouts, which will be accumulated for each of the long or short exposures into an effective value for each. This technique may be combined with the first method of adding a particularly short exposure to the short period to extend the dynamic range as well as shifting it.
  • FIG. 19 illustrates, without limitation, an example of how a repeating 2x2 Bayer pattern (say, the four-pixel G,B,R,G array 1910, where G, B, and R denote Green, Blue, and Red sensors) can be combined with a 3x3 temporal offset dither (say, the (6,1,4; 8,3,7; 0,5,2) array 1920) to achieve a 19 dB increase in dynamic range and a 9x increase in potential frame rate using a long-short-long-short exposure sequence.
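  • The quoted gains check out against the 3x3 dither arithmetic:

```latex
30\,\mathrm{fps}\times 3\times 3 = 270\,\mathrm{fps}\ \text{(a 9x potential rate)},
\qquad 20\log_{10} 9 \approx 19\,\mathrm{dB}.
```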
  • EEE 1 An imaging system, comprising:
  • a pixel image sensor array disposed on a substrate, said pixel image sensor array comprising a plurality of pixels;
  • a multi-stage timer coupled to said pixel image sensor array to trigger at least two exposures of different capture duration of at least one of said plurality of pixels, wherein said at least two exposures are temporally offset;
  • at least one analog to digital converter coupled to said pixel image sensor array to convert said at least two exposures of said at least one of said plurality of pixels to pixel digital values;
  • a memory coupled to said at least one analog to digital converter to store said pixel digital values; and
  • a logic circuit coupled to said memory to determine which of said stored pixel digital values to upload to a video frame.
  • EEE 2 The imaging system of EEE 1, wherein said logic circuit multiplies said stored pixel digital values based upon at least two temporal lengths of the at least two exposures of different capture duration.
  • EEE 3 The imaging system of EEE 1 or EEE 2, wherein the multi-stage timer is a two stage timer having a first timer sequence and a second timer sequence which is less than one half of a temporal length of the first timer sequence.
  • EEE 4 The imaging system of EEE 1 or EEE 2, wherein the multi-stage timer is a four stage timer having a first timer sequence, a second timer sequence, a third timer sequence and a fourth timer sequence, wherein said first timer sequence and said third timer sequence are approximately equivalent in a temporal length and said second timer sequence and said fourth timer sequence are respectively less than one half the temporal length of said first timer sequence and said third timer sequence.
  • EEE 5 The imaging system of any of the EEEs 1-4, wherein the upload of said frame is an MPEG standard rate.
  • EEE 6 The imaging system of any of the EEEs 1-5, wherein the upload of said frame is converted to a unified frame rate.
  • EEE 7 The imaging system of any of the EEEs 1-6, wherein said logic circuit decides which pixel digital values to upload based on degree of movement estimated by the stored pixel digital values of the at least two exposures of different capture duration.
  • EEE 8 The imaging system of any of the EEEs 1-6, wherein said logic circuit decides which pixel digital values to upload based on expected eye movement estimated by a location of the stored pixel digital value within the video frame and the stored pixel digital values of the at least two exposures of different capture duration.
  • EEE 9 The imaging system of any of the EEEs 1-6, wherein said logic circuit decides which pixel digital values to upload based on measured eye movement.
  • EEE 10 The imaging system of any of the EEEs 1-9, wherein said capture duration is variable based upon an exposure overflow.
  • EEE 11 An encoder comprising the imaging system of any of the EEEs 1-10.
  • EEE 12 The imaging system of any of the EEEs 1-10, wherein the upload to said video frame is at least one of rotated, shifted and scrambled and wherein a temporal offset matrix describing the video frame upload is embedded into said video frame.
  • EEE 13 The imaging system of any of the EEEs 1-10 or EEE 12, wherein said logic correlates said stored pixel digital values to a frame location to pool local pixel digital values and adjusts a noise threshold for a subsequent video frame.
  • EEE 14 A method of image processing, comprising: receiving a first exposure of a first capture duration from a pixel; receiving a second exposure of a second capture duration from the pixel, wherein the second capture duration is less than one half of a temporal length of the first capture duration; converting the first exposure to a first pixel digital value; converting the second exposure to a second pixel digital value; multiplying the second pixel digital value based on a ratio of the first capture duration to the second capture duration; storing the first pixel digital value and the second pixel digital value; and selecting one of the first pixel digital value and the second pixel digital value to upload to a video frame.
  • EEE 15 The method of EEE 14, wherein selection is based on degree of movement estimated by the first pixel digital value and the second pixel digital value.
  • EEE 16 The method of EEE 14, wherein selection is based on expected eye movement estimated by a location of the stored pixel digital value within the video frame and the first pixel digital value and the second pixel digital value.
  • EEE 17 The method of EEE 14, wherein selection is based on measured eye movement.
  • EEE 18 An encoder comprising the method of any of the EEEs 14-17.
  • EEE 19 The method of any of the EEEs 14-17, wherein the upload to said video frame is at least one of rotated, shifted and scrambled and wherein a temporal offset matrix describing the video frame upload is embedded into said video frame.
  • EEE 20 The method of any of the EEEs 14-19, wherein said selection correlates said stored pixel digital values to a frame location to pool local pixel digital values and adjust a noise threshold for a subsequent video frame.
  • EEE 21 A method of encoding an image, comprising: receiving a first pixel digital value of a first exposure of a pixel, the first exposure having a first capture duration; receiving a second pixel digital value of a second exposure of the pixel, the second exposure having a second capture duration that is smaller than the first capture duration; comparing the first pixel digital value to the second pixel digital value to determine a pixel digital delta; and selecting for upload to a video frame the first pixel digital value if the pixel digital delta is less than a pre-determined noise value.
  • EEE 22 The method of EEE 21, wherein said second pixel digital value is selected for video frame upload if said pixel digital delta is greater than said pre-determined noise value.
  • EEE 23 The method of EEE 21, wherein said second pixel digital value is selected for video frame upload based on a movement estimation from the pixel digital delta.
  • EEE 24 The method of EEE 23, further comprising compressing a sub-frame based on said movement estimation.
  • EEE 25 The method of EEE 21, wherein said second pixel digital value is selected for video frame upload based on an expected eye movement estimated by a location of the pixel within the video frame and the pixel digital delta.
  • EEE 26 The method of EEE 25, further comprising compressing a sub-frame based on said expected eye movement.
  • EEE 27 The method of EEE 21 wherein said second pixel digital value is selected for video frame upload based on a measured eye movement.
  • EEE 29 The method of any of the EEEs 21-28, further comprising integrating the video frame based upon a capture duration of said pixel.
  • EEE 30 The method of any of the EEEs 21-29, wherein the upload to said video frame is at least one of rotated, shifted and scrambled and wherein a temporal offset matrix describing the video frame upload is embedded into said video frame.
  • EEE 31 The method of any of the EEEs 21-30, wherein said selection correlates at least one of said first pixel digital value and said second pixel digital value to a frame location to pool local pixel digital values and adjust a noise threshold for a subsequent video frame.
  • EEE 32 A method of encoding comprising:
  • EEE 33 The method of encoding of EEE 32, further comprising storing the first pixel digital value and the second pixel digital value.
  • EEE 34 The method of encoding of EEE 32 or EEE 33, wherein said smooth pursuit vector is based on a degree of movement estimated by the first pixel digital value and the second pixel digital value.
  • EEE 35 The method of encoding of any of the EEEs 32-34, wherein said smooth pursuit vector is based on a location of the pixel within said video frame and on a degree of movement estimated by the first pixel digital value and the second pixel digital value.
  • EEE 36 The method of encoding of any of the EEEs 32-35, wherein said smooth pursuit vector is based on a measured eye movement.
  • EEE 37 The method of encoding of any of the EEEs 32-36, wherein said smooth pursuit vector is based on a correlation of a degree of movement estimated by the first pixel digital value and the second pixel digital value and a location of the pixel within said video frame.
  • EEE 38 The method of encoding of any of the EEEs 32-37, wherein the upload to said video frame is at least one of rotated, shifted and scrambled and wherein a temporal offset matrix describing the video frame upload is embedded into said video frame.
  • EEE 39 The method of encoding of any of the EEEs 32-38, wherein said selection correlates at least one of said first pixel digital value and said second pixel digital value to a frame location to pool local pixel digital values and adjust a noise threshold for a subsequent video frame.
  • EEE 40 A method of encoding comprising:
  • EEE 41 The method of encoding of EEE 40, further comprising storing the first pixel digital value and the second pixel digital value.
  • EEE 42 The method of encoding of EEE 40 or EEE 41, wherein the upload to said video frame is at least one of rotated, shifted and scrambled and wherein a temporal offset matrix describing the video frame upload is embedded into said video frame.
  • EEE 43 The method of encoding of any of the EEEs 40-42, wherein said selection correlates at least one of said first pixel digital value and said second pixel digital value to a frame location to pool local pixel digital values and adjust a noise threshold for a subsequent video frame.
  • EEE 44 A method of encoding comprising:
  • EEE 45 The method of encoding of EEE 44, further comprising storing the first pixel digital value and the second pixel digital value.
  • EEE 46 The method of encoding of EEE 44 or EEE 45, wherein the upload to said video frame is at least one of rotated, shifted and scrambled and wherein a temporal offset matrix describing the video frame upload is embedded into said video frame.
  • EEE 47 The method of encoding of any of the EEEs 44-46, wherein said selection correlates at least one of said first pixel digital value and said second pixel digital value to a frame location to pool local pixel digital values and adjust a noise threshold for a subsequent video frame.
  • EEE 48 The method of encoding of any of the EEEs 44-47, further comprising:
  • EEE 49 A method of decoding a temporally offset frame, the method comprising:
  • EEE 50 The method of decoding of EEE 49, further comprising repeating the method for channels having similar frame sequential colors.
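
For illustration only, the per-pixel selection recited in EEEs 21-27 can be pictured with the minimal NumPy sketch below. This is not the claimed implementation: the function name, the fixed noise floor, and the rule of preferring the short capture wherever the exposure-scaled delta exceeds that floor are assumptions made here for concreteness.

```python
import numpy as np

def select_pixel_values(short_exp, long_exp, exposure_ratio, noise_floor=16.0):
    """Choose, per pixel, which stored digital value to upload.

    short_exp, long_exp : 2-D arrays holding the pixel digital values of
        two captures of different duration for the same pixels.
    exposure_ratio : long capture duration divided by short capture duration.
    noise_floor : assumed pre-determined noise value, in code values.
    """
    # Scale the short capture to the long capture's radiometric range so
    # the two digital values are directly comparable.
    scaled_short = short_exp.astype(np.float64) * exposure_ratio

    # Pixel digital delta: for a static, unsaturated pixel this residual
    # stays near the noise floor; a large delta suggests movement during
    # the longer capture.
    delta = np.abs(scaled_short - long_exp)

    # Upload the short (sharper) value where movement is likely and the
    # scaled short value rises above the noise floor; otherwise keep the
    # long (lower-noise) value.
    use_short = (delta > noise_floor) & (scaled_short > noise_floor)
    return np.where(use_short, scaled_short, long_exp.astype(np.float64))

# Toy usage: with noise-free, static pixels the delta never exceeds the
# floor, so the long-exposure values are kept everywhere.
rng = np.random.default_rng(0)
long_exp = rng.integers(0, 1024, size=(4, 4))
short_exp = long_exp // 8          # an 8x shorter capture of the same scene
print(select_pixel_values(short_exp, long_exp, exposure_ratio=8.0))
```

Perturbing a few short-exposure pixels in the toy data (simulating motion during the long capture) would flip those pixels to the scaled short value, which is the behavior EEE 23's movement estimation describes.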

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)

Abstract

An imaging system is described that comprises: a pixel image sensor array disposed on a substrate and having a plurality of pixels; a multi-stage timer coupled to said pixel image sensor array to trigger exposures of said pixels, the pixels being grouped into N subsets and the timer triggering, for each subset, a sequence of at least two exposures of different capture duration of the pixels of that subset, with the start times of the exposure sequences of the different subsets offset in time; at least one pixel ADC coupled to said image sensor array, which converts these pixel exposures into pixel digital values; a memory coupled to said ADC to store said pixel digital values; and a logic circuit coupled to said memory to determine, for each pixel of the image sensor array, which of the corresponding stored pixel digital values to upload into a video frame.
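
As a rough timing illustration of the staggered capture described in this abstract, the short Python sketch below lays out exposure sequences for N pixel subsets whose start times are offset by one Nth of the frame period. Every number in it (4 subsets, a 24 fps frame period, 1 ms and 8 ms captures) is an assumption chosen for readability, not a parameter taken from the application.

```python
def subset_exposure_schedule(n_subsets=4, frame_period_ms=1000 / 24,
                             exposures_ms=(1.0, 8.0)):
    """Map each pixel subset to its list of (start_ms, duration_ms) captures.

    The sequence start of subset s is staggered by s * frame_period / N,
    which is what lets one standard-rate frame carry temporally
    interleaved (higher-rate) samples.
    """
    stagger = frame_period_ms / n_subsets
    schedule = {}
    for s in range(n_subsets):
        t = s * stagger                    # staggered sequence start
        captures = []
        for duration in exposures_ms:      # at least two capture durations
            captures.append((round(t, 3), duration))
            t += duration                  # captures run back to back
        schedule[s] = captures
    return schedule

for subset, captures in subset_exposure_schedule().items():
    print(f"subset {subset}: {captures}")
```

Printed out, subset 0 starts its pair of captures at 0 ms, subset 1 at about 10.4 ms, and so on; this time-offset structure is what the logic circuit later exploits when deciding which stored digital values to upload.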
PCT/US2017/018046 2016-02-22 2017-02-16 Apparatus and method for encoding high frame rate content in standard frame rate video using temporal interlacing Ceased WO2017146972A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/078,490 US10991281B2 (en) 2016-02-22 2017-02-16 Apparatus and method for encoding high frame rate content in standard frame rate video using temporal interlacing

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US201662298085P 2016-02-22 2016-02-22
EP16156756.5 2016-02-22
EP16156756 2016-02-22
US62/298,085 2016-02-22
US201762449804P 2017-01-24 2017-01-24
US62/449,804 2017-01-24

Publications (1)

Publication Number Publication Date
WO2017146972A1 (fr)

Family

ID=55486482

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2017/018046 2016-02-22 2017-02-16 Apparatus and method for encoding high frame rate content in standard frame rate video using temporal interlacing Ceased WO2017146972A1 (fr)

Country Status (1)

Country Link
WO (1) WO2017146972A1 (fr)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040069928A1 (en) * 2002-10-15 2004-04-15 Sagatelyan Dmitry M. System and methods for dynamic range extension using variable length integration time sampling
WO2008138543A1 (fr) * 2007-05-10 2008-11-20 Isis Innovation Limited Image capture device and method
US8958649B2 (en) * 2013-03-13 2015-02-17 Wisconsin Alumni Research Foundation Video generation with temporally-offset sampling
US20150070569A1 (en) * 2013-09-09 2015-03-12 Broadcom Corporation Enhanced Dynamic Range Image Processing
US20150207974A1 (en) * 2014-01-17 2015-07-23 Texas Instruments Incorporated Methods and apparatus to generate wide dynamic range images

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MODY MIHIR ET AL: "Flexible Wide Dynamic Range (WDR) processing support in image signal processor (ISP)", 2015 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), IEEE, 9 January 2015 (2015-01-09), pages 467 - 470, XP032749825, DOI: 10.1109/ICCE.2015.7066488 *
S. SUGAWA ET AL.: "A 100dB Dynamic Range CMOS Image Sensor Using a Lateral Overflow Integration Capacitor", ISSCC DIG. TECH. PAPERS, February 2005 (2005-02-01), page 352

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11019302B2 (en) 2017-09-28 2021-05-25 Dolby Laboratories Licensing Corporation Frame rate conversion metadata
WO2019067762A1 (fr) 2017-09-28 2019-04-04 Frame rate conversion metadata
CN112823510B (zh) 2018-09-12 2022-02-18 Dolby Laboratories Licensing Corporation CMOS sensor architecture for temporal dithered sampling
CN112823510A (zh) 2018-09-12 2021-05-18 Dolby Laboratories Licensing Corporation CMOS sensor architecture for temporal dithered sampling
WO2020055907A1 (fr) 2018-09-12 2020-03-19 Dolby Laboratories Licensing Corporation CMOS sensor architecture for temporal dithered sampling
US11323643B2 (en) 2018-09-12 2022-05-03 Dolby Laboratories Licensing Corporation CMOS sensor architecture for temporal dithered sampling
WO2020106559A1 (fr) * 2018-11-19 2020-05-28 Dolby Laboratories Licensing Corporation Video encoder and encoding method
JP2022508057A (ja) * 2018-11-19 2022-01-19 Dolby Laboratories Licensing Corporation Video encoder and encoding method
JP7143012B2 (ja) 2018-11-19 2022-09-28 Dolby Laboratories Licensing Corporation Video encoder and encoding method
US11876987B2 (en) 2018-11-19 2024-01-16 Dolby Laboratories Licensing Corporation Video encoder and encoding method
CN114845111A (zh) * 2019-03-11 2022-08-02 Dolby Laboratories Licensing Corporation Frame rate scalable video coding
CN115428036A (zh) * 2020-05-04 2022-12-02 Ademco Inc. System and method for encoding regions of an image sequence containing an element of interest at high resolution
CN115118974A (zh) * 2022-06-22 2022-09-27 Tsinghua University Video generation method, apparatus and system, electronic device, and readable storage medium
WO2023246041A1 (fr) * 2022-06-22 2023-12-28 Tsinghua University Video generation method, apparatus and system, electronic device, and readable storage medium

Similar Documents

Publication Publication Date Title
US10991281B2 (en) Apparatus and method for encoding high frame rate content in standard frame rate video using temporal interlacing
WO2017146972A1 (fr) Apparatus and method for encoding high frame rate content in standard frame rate video using temporal interlacing
JP5726057B2 (ja) Camera and method for acquiring a series of frames of a scene as video
US8159579B2 (en) High dynamic range video
Gu et al. Coded rolling shutter photography: Flexible space-time sampling
US8749646B2 (en) Image processing apparatus, imaging apparatus, solid-state imaging device, image processing method and program
US10091434B2 (en) System and method for capturing digital images using multiple short exposures
KR102314703B1 (ko) Method of generating joint dictionaries for image processing, interlace-based high dynamic range imaging apparatus using the joint dictionaries, and image processing method thereof
US8890983B2 (en) Tone mapping for low-light video frame enhancement
EP3884673B1 (fr) Video encoder and encoding method
JP2012257193A (ja) Image processing apparatus, imaging apparatus, image processing method, and program
CN105323498A (zh) High dynamic range (HDR) images free of motion artifacts
KR20150045877A (ko) Image processing apparatus and image processing method
CN104580940A (zh) Imaging system, imaging apparatus, encoding apparatus, and imaging method
US7499081B2 (en) Digital video imaging devices and methods of processing image data of different moments in time
Kim et al. Color interpolation algorithm for the Sony-RGBW color filter array
Cho et al. Alternating line high dynamic range imaging
Keinert et al. High-dynamic range video cameras based on single shot non-regular sampling

Legal Events

Code Description
  • NENP: Non-entry into the national phase (Ref country code: DE)
  • 121: Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17706953; Country of ref document: EP; Kind code of ref document: A1)
  • 122: Ep: pct application non-entry in european phase (Ref document number: 17706953; Country of ref document: EP; Kind code of ref document: A1)