WO2025201947A1 - Circuitry and method - Google Patents
Circuitry and method
- Publication number
- WO2025201947A1 (application PCT/EP2025/057234)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image data
- phases
- output
- circuitry
- compressed image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/95—Computational photography systems, e.g. light-field imaging systems
- H04N23/951—Computational photography systems, e.g. light-field imaging systems by using two or more images to influence resolution, frame rate or aspect ratio
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4053—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
- G06T3/4069—Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution by subpixel displacements
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/60—Image enhancement or restoration using machine learning, e.g. neural networks
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N25/00—Circuitry of solid-state image sensors [SSIS]; Control thereof
- H04N25/40—Extracting pixel data from image sensors by controlling scanning circuits, e.g. by modifying the number of pixels sampled or to be sampled
- H04N25/46—Extracting pixel data from image sensors by controlling scanning circuits, e.g. by modifying the number of pixels sampled or to be sampled by combining or binning pixels
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N25/00—Circuitry of solid-state image sensors [SSIS]; Control thereof
- H04N25/48—Increasing resolution by shifting the sensor relative to the scene
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N25/00—Circuitry of solid-state image sensors [SSIS]; Control thereof
- H04N25/70—SSIS architectures; Circuits associated therewith
- H04N25/76—Addressed sensors, e.g. MOS or CMOS sensors
- H04N25/77—Pixel circuitry, e.g. memories, A/D converters, pixel amplifiers, shared circuits or shared components
- H04N25/772—Pixel circuitry, e.g. memories, A/D converters, pixel amplifiers, shared circuits or shared components comprising A/D, V/T, V/F, I/T or I/F converters
- H04N25/773—Pixel circuitry, e.g. memories, A/D converters, pixel amplifiers, shared circuits or shared components comprising A/D, V/T, V/F, I/T or I/F converters comprising photon counting circuits, e.g. single photon detection [SPD] or single photon avalanche diodes [SPAD]
- H—ELECTRICITY
- H10—SEMICONDUCTOR DEVICES; ELECTRIC SOLID-STATE DEVICES NOT OTHERWISE PROVIDED FOR
- H10F—INORGANIC SEMICONDUCTOR DEVICES SENSITIVE TO INFRARED RADIATION, LIGHT, ELECTROMAGNETIC RADIATION OF SHORTER WAVELENGTH OR CORPUSCULAR RADIATION
- H10F30/00—Individual radiation-sensitive semiconductor devices in which radiation controls the flow of current through the devices, e.g. photodetectors
- H10F30/20—Individual radiation-sensitive semiconductor devices in which radiation controls the flow of current through the devices, e.g. photodetectors the devices having potential barriers, e.g. phototransistors
- H10F30/21—Individual radiation-sensitive semiconductor devices in which radiation controls the flow of current through the devices, e.g. photodetectors the devices having potential barriers, e.g. phototransistors the devices being sensitive to infrared, visible or ultraviolet radiation
- H10F30/22—Individual radiation-sensitive semiconductor devices in which radiation controls the flow of current through the devices, e.g. photodetectors the devices having potential barriers, e.g. phototransistors the devices being sensitive to infrared, visible or ultraviolet radiation the devices having only one potential barrier, e.g. photodiodes
- H10F30/225—Individual radiation-sensitive semiconductor devices in which radiation controls the flow of current through the devices, e.g. photodetectors the devices having potential barriers, e.g. phototransistors the devices being sensitive to infrared, visible or ultraviolet radiation the devices having only one potential barrier, e.g. photodiodes the potential barrier working in avalanche mode, e.g. avalanche photodiodes
Definitions
- BI burst imaging
- BI includes acquiring a plurality of image frames with a short exposure and merging the plurality of image frames (which may also be known as micro-frames) into an image frame such that a motion blur in the resultant image frame is reduced as compared to an image obtained by one-shot imaging with a long exposure.
- the disclosure provides circuitry that is configured to: obtain, from an image sensor, compressed image data that have been generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associate, for at least some of the plurality of phases, a portion of an object that is represented by the compressed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data.
- the disclosure provides a method that includes: obtaining, from an image sensor, compressed image data that have been generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associating, for at least some of the plurality of phases, a portion of an object that is represented by the compressed image data; and generating output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data.
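For illustration, the claimed obtain/associate/deblur flow can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name generate_output_image, the FFT cross-correlation used for the association, and the circular-shift alignment are placeholders chosen for this example, not the implementation defined by the disclosure.

```python
import numpy as np

def generate_output_image(compressed_frames):
    """Hypothetical sketch: associate a portion of an object across SPB
    phases via cross-correlation, align, and merge (deblur)."""
    ref = compressed_frames[0]
    fref = np.fft.fft2(ref)
    aligned = []
    for frame in compressed_frames:
        # Associate: the peak of the circular cross-correlation gives the
        # shift that maps this phase onto the reference phase.
        corr = np.fft.ifft2(fref * np.conj(np.fft.fft2(frame))).real
        shift = np.unravel_index(np.argmax(corr), corr.shape)
        # Align (circular shift for brevity; real data needs border handling).
        aligned.append(np.roll(frame, shift, axis=(0, 1)))
    # Merge: matching portions now add up at the same pixel positions.
    return np.sum(aligned, axis=0)

frames = [np.random.default_rng(i).poisson(1.0, (32, 32)).astype(float)
          for i in range(4)]
output = generate_output_image(frames)
```

Note that this sketch associates a single global shift per phase; the disclosure also contemplates associating individual portions of an object (e.g., per detected object), which the example omits for brevity.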
- Fig.1 illustrates an embodiment of an electronic device with a circuitry
- Fig.2 illustrates an embodiment of a method performed by a circuitry
- Fig.3 illustrates embodiments of binning photosensitive elements
- Fig.4 illustrates an embodiment of a pixel binning in a first phase of shifted-pixel binning in a color array
- Fig.5 illustrates an embodiment of a pixel binning in a second phase of shifted-pixel binning in a color array
- Fig.6 illustrates an embodiment of a pixel binning in a third phase of shifted-pixel binning in a color array
- Fig.7 illustrates an embodiment of a pixel binning in a fourth phase of shifted-pixel binning in a color array
- Fig.8 illustrates an embodiment of generating an output high-resolution color image based on shifted-pixel binning
- BI burst imaging
- BI includes acquiring a plurality of image frames with a short exposure (e.g., 1/250 seconds, without limiting the disclosure to this value) and (aligning and) merging the plurality of image frames (which may also be known as micro-frames) into an image frame such that a motion blur in the image is reduced as compared to an image obtained by one-shot imaging with a long exposure (that may, for example, correspond to a sum of the short exposures of the plurality of image frames, e.g., 1/5 seconds, without limiting the disclosure to this value).
- BI may be used in photography and in video recordings.
- for one-shot imaging, when a camera/scene moves while an exposure time is relatively long (which may be needed, e.g., in low-light scenes), the movement may result in motion-blurred images.
- a quality of the resulting images may be degraded for end users and/or for artificial intelligence (AI) feature extraction (e.g., an AI may need a minimum signal-to-noise ratio (SNR) and/or image quality (e.g., XY definition, sharpness, etc.) to perform object detection/recognition).
- burst imaging e.g., quanta burst imaging (QBI)
- multiple images may be acquired using a shorter exposure time.
- each image of the burst may include less motion-blur but also less SNR.
- a tangential velocity of the target of 1 m/s with respect to the camera may correspond to a blur of 5.13 pixels/ms of a resulting image (wherein the disclosure is not limited to these values). Therefore, motion blur may be reduced by reducing an exposure time.
- an image with lower motion-blur than one-shot imaging, but an equivalent SNR, may be obtained. For example, for illustration, a one-shot image may be acquired with a total exposure time of 1/5 s as reference.
- a burst of 50 image frames with an exposure time of 1/250 s each may be acquired, resulting in a same total exposure time of 1/5 s as for the reference one-shot image.
- a result of the burst imaging after aligning and merging the burst of 50 image frames of 1/250 s exposure time each may correspond to a total exposure time of 1/5 s (i.e., the same total exposure time as for the reference one-shot image), but may include a reduced motion blur.
- the technology is not limited to the values disclosed herein as exposure/shutter time and/or as number of image frames in a burst, and that any other suitable exposure/shutter time and/or number of image frames in a burst may be used instead.
- a sum of the exposures of the burst of image frames may correspond to a total exposure time of a one-shot image, but the technology is not limited thereto, and a sum of exposures of a burst of image frames may be different than a total exposure time of a reference one-shot image.
- Use cases of burst imaging may include obtaining a non-blurred image (or an image with a reduced motion blur), providing a higher frames-per-second (fps) value to a host, and/or generating a super-resolution image.
- CMOS complementary metal-oxide-semiconductor
- burst imaging with a conventional CiS may not achieve the same SNR as a one-shot image with the same total exposure time in some embodiments.
- a solution may involve using a sensor with a negligible read-out noise, e.g., a sensor with quanta image capability (e.g., with a read-out noise < 0.23 e- root mean square (rms)).
- the image sensor may connect all photosensitive elements of a bin to a single counter such that the counter may receive, via corresponding wiring, and count signals that may indicate a charge carrier avalanche caused by a photon in any photosensitive element (e.g., SPAD) of the bin.
- the image sensor may sum photon counts from several counters each of which may receive, via corresponding wiring, and count signals that may indicate a charge carrier avalanche caused by a photon in one photosensitive element or in a subset of photosensitive elements of a bin.
- the skilled person may find further ways of binning photosensitive elements.
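The difference between the two readout styles described above can be illustrated numerically. The sketch below is an assumption-laden toy model, not the disclosed circuit: detection events of four SPADs in one bin are either OR-combined into a single counter, where coincident pulses in the same time slot collapse into one count, or counted per element and summed afterwards (pulse summing).

```python
import numpy as np

rng = np.random.default_rng(0)
# Detection events of 4 SPADs in one bin over 1000 short time slots
# (True = an avalanche was triggered in that slot).
events = rng.random((4, 1000)) < 0.002

# Variant 1: all SPADs of the bin wired to a single counter. Pulses that
# coincide in one slot collapse into one count (a logical OR).
single_counter = int(np.count_nonzero(events.any(axis=0)))

# Variant 2: pulse summing. Each element (or subset) feeds its own
# counter and the photon counts are added afterwards.
summed_counters = int(events.sum())

print(single_counter, summed_counters)  # variant 1 <= variant 2
```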
- the binning patterns may be based on a predefined kernel.
- the kernel may indicate a size and shape of the bins.
- the kernel may indicate a size of 2x2 photosensitive elements, 3x3 photosensitive elements, 4x4 photosensitive elements, 1x2 photosensitive elements, 2x1 photosensitive elements, MxM photosensitive elements or MxN photosensitive elements (where M and N may be any suitable integers).
- the skilled person may find a suitable size for the kernel.
- the bins may have a same size according to a same kernel or may have different sizes according to different kernels.
- the bins may be shifted between subsequent phases of SPB according to the predefined binning patterns. For example, each bin may be shifted in a region of the array of photosensitive elements around a center-of-mass that may be associated with the bin.
- the bins may be shifted, between subsequent phases, by one row and/or by one column of the array of photosensitive elements.
- the bins may be shifted through the region of the array according to a predefined shifting pattern or randomly. Due to the shifting, the photon count acquired for a bin in the different phases may on average correspond to a photon count at the center-of-mass that is associated with the bin.
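A minimal sketch of how such shifted bins could be computed in software is given below; the 2x2 kernel, the one-row/one-column shift pattern, and the edge cropping are illustrative assumptions. For a 2x2 kernel, each phase carries roughly one quarter of the original pixel count, which matches the compression discussed next.

```python
import numpy as np

def spb_phase(frame, kernel, shift):
    """One SPB phase: shift the bin grid by (dy, dx) and return one photon
    count per kernel x kernel bin (edges are cropped for simplicity)."""
    dy, dx = shift
    h, w = frame.shape
    hb, wb = (h - dy) // kernel, (w - dx) // kernel
    view = frame[dy:dy + hb * kernel, dx:dx + wb * kernel]
    return view.reshape(hb, kernel, wb, kernel).sum(axis=(1, 3))

frame = np.random.default_rng(1).poisson(0.5, size=(8, 8))
# Four phases of a 2x2 kernel, shifted by one row and/or one column.
phases = [spb_phase(frame, 2, s) for s in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```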
- the compressed image data may be compressed in that the image sensor may acquire a single photon count per bin and per image frame such that a number of acquired photon counts per image frame may be lower than a number of photosensitive elements of the image sensor.
- a compression ratio at which the compressed image data may be compressed may correspond to the kernel size of the kernel.
- a smaller amount of data may have to be transmitted from the image sensor to the circuitry, such that the circuitry may be able to obtain the compressed image data at a higher frame rate, and/or a power consumption of the image sensor and/or of the circuitry may be reduced.
- a size and/or complexity of the image sensor and/or circuitry may also be reduced in some embodiments.
- a number of counters that is required in a logic layer of the image sensor may be lower than the number of photosensitive elements of the image sensor, such that a larger logic-node technology may be used although a smaller pixel pitch may be realized.
- the circuitry may be configured as a host of the image sensor that receives the compressed image data from the image sensor.
- the obtaining of the compressed image data may include causing the image sensor to generate the compressed image data and receiving the compressed image data from the image sensor via the communication unit.
- the image sensor may acquire a plurality of image frames based on SPB (e.g., one image frame per phase of the SPB), as described above, wherein the image sensor may acquire one photon count per bin of photosensitive elements for each image frame.
- the image sensor may generate the compressed image data such that the compressed image data may include a value that is based on an acquired photon count for each bin of photosensitive elements and for each image frame.
- the value that is based on the acquired photon count may be configured as a Boolean value, as an integer value or as a floating-point value.
- a data type and range of the value is not limited to these examples, and the skilled person may find a suitable data type and range for the value.
- the value may represent one pixel of an image frame of the compressed image data.
- the image sensor may transmit the compressed image data as a stream, such that the circuitry may process earlier frames of the compressed image data before the image sensor has generated later frames of the compressed image data.
- the circuitry may accumulate a predefined number of subsequent image frames of the compressed image data and generate the output image data based on the accumulated subsequent image frames.
- the image sensor may acquire the image frames with a short exposure time, e.g., at a frame rate of 250 fps, 500 fps or 1000 fps (without limiting the disclosure to these values).
- the disclosure is not limited to upsampling the output image data to a resolution of the array of photosensitive elements.
- the circuitry may as well upsample the output image data to a resolution that may be lower or higher than the array of photosensitive elements, or may omit the upsampling and instead generate the output image data with a resolution that corresponds to the number of bins.
- the image sensor may, in some cases, image an object that moves with respect to the image sensor.
- the image sensor may be directed at a scene and may receive light from the scene (e.g., through a lens, a mirror or the like, which may focus the light on the image sensor).
- the image sensor may be moved (e.g., translated or rotated or distance change) with respect to the scene, or a lens or mirror may be moved (e.g., zoomed or panned) in front of the image sensor such that light from another portion of the scene may be focused on the image sensor.
- the circuitry may associate, for at least some of the plurality of phases, a portion of an object that is represented by the compressed image data.
- the portion may be represented by one or more pixels, by a set of pixels, by a wavelet, by an edge, by a corner and/or by any other suitable image feature in the compressed image data.
- the associating of the portion of the object may include matching (e.g., within a predefined tolerance or threshold) and/or identifying (representations of) the portion of the object in the compressed image data for (at least some of) the plurality of phases.
- the associating of the portion of the object may include determining/identifying, for (at least some of) the plurality of phases, portions in the compressed image data that correspond to (e.g., represent) the portion of the object.
- a number of pixels of the image frame of the image data may correspond to a product of a number of pixels per image frame of the compressed image data and a number of phases which compressed image data are included in the image data.
- the number of pixels of the image frame of the image data may correspond to a sum of numbers of pixels of all image frames of the compressed image data that are included in the image data.
- the image data may be configured as a pre-output array, in which pixels of the compressed image data are arranged (e.g., aligned) according to the determined association.
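The pre-output arrangement can be sketched as a scatter of bin values into a full-resolution grid. The function below is a hypothetical illustration that places each bin value at its reference (pivoting) pixel and omits any motion-based registration; names and shapes are assumptions of this example.

```python
import numpy as np

def build_pre_output(phase_bins, shifts, kernel, out_shape):
    """Scatter each phase's bin values into a full-resolution pre-output
    array at the bin's reference (pivoting) pixel, so each pixel value of
    each phase contributes to exactly one pre-output pixel."""
    pre = np.zeros(out_shape)
    for (dy, dx), bins in zip(shifts, phase_bins):
        ys = dy + np.arange(bins.shape[0]) * kernel
        xs = dx + np.arange(bins.shape[1]) * kernel
        pre[np.ix_(ys, xs)] = bins
    return pre

shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]                       # one 2x2 SPB cycle
phase_bins = [np.full((4, 4), float(p + 1)) for p in range(4)]  # dummy counts
pre_output = build_pre_output(phase_bins, shifts, 2, (8, 8))
```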
- the association (and, in some embodiments, an alignment based on the association) may be performed based on computer vision (CV; e.g., feature detection and matching).
- since the compressed image data may represent a moving object at different positions for different phases (e.g., in different image frames), a motion blur caused by a movement of the object may be reduced by arranging (e.g., aligning) the pixels of the compressed image data according to the determined association.
- the deblurring of the (upsampled compressed) image data and/or the generating of the output image data is not limited to summing pixel values, but may include further image processing techniques such as adjusting a brightness and/or a color of one or more image frames and/or applying an image filter.
- the deblurring may be based on analytical methods (e.g., based on detecting edges of objects, wherein an edge may be determined as a boundary between adjacent areas of different colors or textures, wherein a difference of colors or textures between the areas may exceed a predefined threshold) and/or on a machine learning method (e.g., an artificial neural network).
- the circuitry may associate, align and deblur each detected object (or portion thereof) separately from other (portions of) objects represented by the compressed image data.
- the output image data, which are based on deblurring the image data, may represent the scene with a level of detail that corresponds to the number of photosensitive elements of the image sensor.
- the deblurring may correspond to a decompression that reconstructs a representation of the scene based on the compressed image data that have been compressed by SPB.
- the circuitry may use the compressed image data as the image data (that are based on the compressed image data and) that are deblurred.
- Acquiring the photon count from bins with different kernel sizes simultaneously may allow for associating the portion of the object at different levels of granularity. For example, the association may first be determined on a coarse level based on a larger kernel size, and then on a fine level based on a smaller kernel size, wherein the associating on the fine level may be based on bins with the smaller kernel size that overlap bins with the larger kernel size that have been associated with the portion of the object. Therefore, acquiring the photon count from bins with different kernel sizes simultaneously may allow to reduce an amount of processing needed for the associating, and thus may increase a performance (e.g., frame rate and/or quality of the output image data) and/or reduce a consumption of electric power and/or of computing resources.
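A coarse-to-fine association of this kind might look as follows; the brute-force shift search and the specific search radii are assumptions chosen to keep the sketch short.

```python
import numpy as np

def best_shift(ref, img, candidates):
    """Brute-force integer displacement minimizing the absolute error."""
    errors = {s: float(np.abs(np.roll(img, s, axis=(0, 1)) - ref).sum())
              for s in candidates}
    return min(errors, key=errors.get)

def coarse_to_fine(ref_c, img_c, ref_f, img_f, ratio=2, radius=2):
    # Coarse pass: wide search on the large-kernel (cheap) bins.
    coarse = [(dy, dx) for dy in range(-3, 4) for dx in range(-3, 4)]
    cy, cx = best_shift(ref_c, img_c, coarse)
    # Fine pass: narrow search around the upscaled coarse estimate.
    fine = [(cy * ratio + dy, cx * ratio + dx)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]
    return best_shift(ref_f, img_f, fine)

rng = np.random.default_rng(0)
ref_f = rng.random((32, 32))
img_f = np.roll(ref_f, (5, -3), axis=(0, 1))            # known displacement
bin2 = lambda a: a.reshape(16, 2, 16, 2).mean(axis=(1, 3))
print(coarse_to_fine(bin2(ref_f), bin2(img_f), ref_f, img_f))  # -> (-5, 3)
```

The coarse pass searches a wide window on few, large bins; the fine pass only probes a small window around the upscaled coarse estimate, which is where the processing saving comes from.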
- Some embodiments pertain to circuitry that is configured to: obtain, from an image sensor, compressed image data that have been generated by the image sensor based on shifted-pixel binning; associate, for at least some of the plurality of shifted-pixel binning phases, a portion of an object that is represented by the compressed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of shifted-pixel binning phases, image data that are based on the compressed image data.
- the features described above with respect to the circuitry and/or to the method may be combined in any suitable way.
- Fig.1 illustrates an embodiment of an electronic device 1 with a circuitry 2.
- the communication unit 6 includes interfaces for communicating with other devices, including a Camera Serial Interface (CSI) for receiving compressed image data, and a Universal Serial Bus (USB) interface for transmitting output image data generated by the circuitry.
- the electronic device 1 further includes an image sensor 7 and a lens 8.
- the image sensor 7 includes a two-dimensional array 7a of single-photon avalanche diodes (SPADs), which are examples of photosensitive elements.
- the image sensor 7 also includes a logic layer 7b that is laminated to the array 7a of SPADs on a side that is opposite to a light-incident side of the array 7a of SPADs.
- the lens 8 focuses light from a scene on the array 7a of SPADs.
- the image sensor 7 is configured to generate compressed image data based on shifted-pixel binning (SPB), as described with respect to Fig.3 to 9, and to transmit the generated compressed image data via CSI to the circuitry 2.
- the circuitry 2 is configured to receive and process the compressed image data and to generate output image data according to any one of the methods described below with respect to Fig.10 to 18.
- Fig.2 illustrates an embodiment of a method 10 performed by the circuitry 2 of Fig.1.
- the control unit 3 determines a number of phases that is associated with a predefined image quality of the output image data.
- the control unit 3 causes the image sensor 7 to acquire the photon count for the determined number of phases.
- the image sensor 7 generates compressed image data based on shifted-pixel binning (SPB).
- the generating of the output image data includes upsampling, at 21, the compressed image data.
- by the upsampling at 21, the circuitry generates image data that include a pre-output array.
- the generating of the output image data further includes deblurring, at 22, the image data, which have been obtained at 21 based on the compressed image data, based on the association of the portion of the object.
- the compressed image data, which are inputted to the artificial neural network 18, are associated with a predefined number of phases; and the artificial neural network 18 generates the output image data based on the inputted compressed image data, as indicated by an arrow in Fig.2.
- Cathodes of the SPADs 31 and 32 are connected by a switch 35 such that photon detection signals from any one of the SPADs 31 and 32 are received by a NOT gate 36, which is connected to the cathode of the SPAD 32.
- An output from the NOT gate 36 is provided to a counter 37.
- the counter 37 counts photon detection signals and outputs a signal that indicates a photon count to a scanning circuit.
- a second embodiment 40 is based on pulse summing.
- Two SPADs 41 and 42 (which are examples of photosensitive elements) are each connected to a respective resistor 43 and 44.
- Cathodes of both SPADs 41 and 42 are connected to a respective NOT gate 45 and 46.
- the green SPADs G’34 and G’44 are binned with further SPADs (not shown in Fig.5), from which the image sensor 7 acquires a photon count 66a.
- Fig.6 illustrates an embodiment of a pixel binning in a third phase (Phase #3) of SPB in the color array 60.
- in Phase #3, the green SPADs G’22, G’23, G’32 and G’33 are binned, from which the image sensor 7 acquires a photon count 67.
- Fig.7 illustrates an embodiment of a pixel binning in a fourth phase (Phase #4) of SPB in the color array 60.
- Fig.9 illustrates an embodiment of SPB in a grayscale array 80 and generating an output high- resolution grayscale image 83 based on SPB.
- the grayscale array 80 is an example of the array 7a of SPADs of Fig.1.
- the grayscale array 80 includes a plurality of SPADs, which are examples of photosensitive elements, and which are capable of detecting light of any color (wavelength) within a predefined range (in the case of Fig.9, within the visible range).
- the circuitry 2 obtains the compressed image data from the image sensor 7 and merges the compressed image data into a pre-output array 81, wherein pixel values of the pre-output array 81 are based on the photon counts from corresponding bins, as indicated by arrows in Fig.9.
- the circuitry 2 merges (which is an example of the upsampling at 21 of Fig.2) the aligned frames 100 to 103 to a pre-output image 105 (which is an example of the pre-output array 70 of Fig.8 and of the pre-output array 81 of Fig.9).
- the pre-output image 105 is still blurred (similar to being out-of-focus), as illustrated by dotted contours of the pre-output image 105. Since the aligned frame 104 is associated with another SPB cycle, the aligned frame 104 is merged with aligned frames of the other SPB cycle.
- the circuitry 2 then generates an output image 106 (which is an example of output image data).
- in an aligned frame 113, the objects from the image frame 95 are aligned to their positions in the image frame 97, which has been acquired in Phase #4.
- in aligned frames 114 to 117, the objects from the image frame 96 are aligned to their positions in the image frames 95 to 98, respectively.
- in aligned frames 118 to 121, the objects from the image frame 97 are aligned to their positions in the image frames 95 to 98, respectively.
- in aligned frames 122 to 125, the objects from the image frame 98 are aligned to their positions in the image frames 95 to 98, respectively.
- the objects from the scene in the image frames 95 to 98 of the different phases are aligned to all phases of the SPB cycle.
- since the SPB cycle in Fig.11 includes four phases, the circuitry 2 generates four aligned image frames from each of the image frames 95 to 98 (aligned frames 110 to 125 in total).
- Fig.12 illustrates an embodiment of generating a plurality of output images from a shifted-pixel binning cycle.
- the circuitry 2 merges the aligned image frames 110 to 125 into four pre-output images 126 to 129 according to their associated phases.
- the merging of the aligned image frames 110 to 125 is an example of the upsampling at 21 of Fig.2, and the pre-output images 126 to 129 are examples of the pre-output array 70 of Fig.8 and of the pre-output array 81 of Fig.9.
- the pre-output image 126 corresponds to Phase #1 and is based on merging the aligned image frames 110, 114, 118 and 122.
- the pre-output image 127 corresponds to Phase #2 and is based on merging the aligned image frames 111, 115, 119 and 123.
- the pre-output image 128 corresponds to Phase #3 and is based on merging the aligned image frames 112, 116, 120 and 124.
- the pre-output image 129 corresponds to Phase #4 and is based on merging the aligned image frames 113, 117, 121 and 125.
- the output image 131 corresponds to Phase #2 and is based on deblurring the pre-output image 127.
- the output image 132 corresponds to Phase #3 and is based on deblurring the pre-output image 128.
- the output image 133 corresponds to Phase #4 and is based on deblurring the pre-output image 129.
- the embodiment of Fig.11 and 12 may, in some cases, introduce a delay in an output due to an amount of processing, but may generate more fps (i.e., a higher frame rate) of an output movie.
- Fig.13 illustrates an embodiment of a neural network (NN) pyramidal workflow of SPB QBI based on step window processing.
- input frames are recorded using a QBI array (e.g., the array 7a of Fig.1) with on-sensor SPB compression capabilities.
- objects may be in movement.
- the upsampling 153 is performed if the input frames are SPB-compressed, and provides its output as a reference to an alignment NN 154.
- the alignment NN 154 further receives the image frame #4n+3 as a reference, and provides its output to a mask 155.
- the mask 155 is applied if the input frames are SPB-compressed, and provides its output to a sum 156.
- the image frame #4n+1 corresponds to Phase #2 and is inputted to an upsampling 157.
- the upsampling 157 is performed if the input frames are SPB-compressed, and provides its output as a reference to an alignment NN 158.
- the alignment NN 158 further receives the image frame #4n+3 as a reference, and provides its output to a mask 159.
- the mask 159 is applied if the input frames are SPB-compressed, and provides its output to the sum 156.
- the image frame #4n+2 corresponds to Phase #3 and is inputted to an upsampling 160.
- the upsampling 160 is performed if the input frames are SPB-compressed, and provides its output as a reference to an alignment NN 161.
- the alignment NN 161 further receives the image frame #4n+3 as a reference, and provides its output to a mask 162.
- the mask 162 is applied if the input frames are SPB-compressed, and provides its output to the sum 156.
- the image frame #4n+3 is further inputted to an upsampling 163.
- the upsampling 163 is performed if the input frames are SPB-compressed, and provides its output as a reference to a mask 164.
- the mask 164 is applied if the input frames are SPB-compressed, and provides its output to the sum 156.
- the sum 156 adds its inputs and provides its result to a sum 165.
- the sum 165 further receives a previous output frame aligned to the input frame #4n+3 from Phase #4, and adds its inputs.
- the sum 165 provides its result to a decompression & enhancement NN 166.
- the decompression & enhancement NN 166 is performed if the input frames are SPB- compressed, and decompressed and enhances its input.
- the decompression performed by the decompression & enhancement NN 166 includes generating a (deblurred) image by decompressing the SPB-compression of the input frames.
- Examples of an enhancement performed by the decompression & enhancement NN 166 include alignment corrections, denoising, SPB-decompression, etc.
- the decompression & enhancement NN 166 generates an enhanced frame.
- a feedback loop provides the enhanced frame as a previous output frame to the alignment NN 152 for a subsequent iteration of the processing of Fig.15 (e.g., for a subsequent SPB cycle).
- the alignment NN 152 generates, as its output, the previous output frame aligned to the input frame #4n+3 from Phase #4, and provides its output to the sum 165.
- Both color/RGB and monochrome/grayscale data are supported via loading and preprocessing 191.
- the pipeline 190 has two blocks.
- in a first block 192, multiple input images are first aligned 193 and an optical flow is obtained for each input.
- each input is warped 194 to a latest frame according to the optical flow (the input may also be warped to any other frame, e.g., to a first frame). All the warped images are summed 195 together.
- a mask that is related to the CR pattern (shifted-pixel binning around a pivoting pixel) is used for the sum.
- each input is warped 204 to a latest frame according to the optical flow (the input may also be warped to any other frame, e.g., to a first frame). All the warped images are summed together.
- a mask that is related to the CR pattern (shifted- pixel binning around a pivoting pixel) is used for the sum.
- a second block 206 is only for CR mode, where a decompressor is used to recover HR.
- a denoiser is used to improve an image quality.
- An enhancement NN (#3) receives mosaic inputs and generates RGB output, i.e., it supports demosaicing.
- an output 208 of the pipeline 200 may be used as a feedback image in order to continue improving the output 208.
- different datasets may be used to train different neural networks.
- Specific datasets may be generated to train different neural networks in different pipelines, e.g., alignment net, decompressor, and enhancement.
- the key point may be to make sure that a training dataset and realistic testing data have similar characteristics, e.g., a similar distribution.
- a ground truth of the datasets may contain images that also show moving objects, in order to include those characteristics in the training dataset.
- a quanta sensor may work in very low light considering its zero read-out noise, i.e., it may exhibit only Poisson (shot) noise.
- datasets are generated following a Poisson distribution.
- a wide range of Poisson variances is used to train multiple light conditions, since the lower the light, the higher the noise level may be.
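A sketch of such dataset generation is shown below; the light levels and the image source are placeholders, and the only noise modeled is Poisson shot noise, matching the zero-read-noise assumption above.

```python
import numpy as np

def make_training_pair(ground_truth, photons_per_unit, rng):
    """Simulate a quanta-sensor observation: scale the clean image to an
    expected photon count and draw Poisson samples (shot noise only; no
    read-out noise term, matching the zero-read-noise assumption)."""
    rate = ground_truth * photons_per_unit
    noisy = rng.poisson(rate).astype(np.float32)
    return noisy, ground_truth

rng = np.random.default_rng(42)
gt = rng.random((64, 64))                      # stand-in for a clean image
# Sweep light levels: the lower the rate, the higher the relative noise.
pairs = [make_training_pair(gt, level, rng) for level in (0.5, 2.0, 8.0, 32.0)]
```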
- Different models are trained with different datasets for different pipeline and modes.
- For a decompressor (e.g., the decompressor 186 or 206) and an enhancement of CR mode (e.g., the enhancement NN 187, 197 or 207), a data sequence is compressed using SPB, and the compressed data sequence is passed through the align-warp-mask_sum pipeline 182, 192 or 202, respectively, and a dataset is obtained that has the artifacts (i.e., lower resolution and out-of-focus result) of CR.
- For an enhancement of HR mode (e.g., the enhancement NN 197 or 207), Poisson noise is first added to ground-truth images, then color is converted to monochrome if white-black mode is used. Then a data sequence is passed through the align-warp-sum pipeline 192 or 202, respectively, and a dataset of HR artifacts is obtained.
- Poisson noise is first added to ground-truth images, then color is converted to monochrome if white-black mode is used. Then a data sequence is subsampled, and the processed sequence is passed through the align-warp-sum pipeline 192 or 202, respectively, and a dataset of LR artifacts is obtained.
- Regarding the align-warp-sum blocks 182, 192 and 202, respectively, the following is noted.
- An alignment net is used to obtain a pixel-wise movement between two adjacent frames. Given two images I1 and I2, the alignment net estimates a motion of a pixel point from I1 to I2.
- the output flow contains a movement vector of each pixel in horizontal and vertical directions.
- noisy I1 and I2 are first created, and the optical flow f is used as the ground-truth. It is noted that a range of noise levels may be wide to suit complex realistic scenes.
- the objective frame is mapped to a template frame. The mapping is realized via warp, where interpolation is used to make the mapping finer.
- the warped objective image is summed together with the template image to enhance an image quality.
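The warp and masked-sum steps could be sketched as follows; the bilinear map_coordinates-based warp and the multiplicative mask are illustrative choices assumed for this example, not the disclosed implementation.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_to_template(objective, flow):
    """Warp the objective frame onto the template grid using a per-pixel
    flow field (dy, dx); bilinear interpolation makes the mapping finer."""
    h, w = objective.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    coords = np.stack([ys + flow[..., 0], xs + flow[..., 1]])
    return map_coordinates(objective, coords, order=1, mode='nearest')

def masked_sum(template, warped, mask):
    """Sum the warped objective with the template; the mask limits the
    contribution to the SPB (CR) sampling pattern around pivoting pixels."""
    return template + mask * warped

h = w = 16
objective = np.random.default_rng(0).random((h, w))
flow = np.zeros((h, w, 2))
flow[..., 1] = 1.5                 # uniform 1.5-px horizontal motion
warped = warp_to_template(objective, flow)
```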
- Regarding the decompressor blocks 186 and 206, respectively, the following is noted.
- the decompressor is used to recover a HR image from a CR sequence.
- the CR sequence is compressed from an HR sequence.
- a CR image I_CR may have a higher SNR but a lower modulation transfer function (MTF) compared to an HR image I_HR.
- the decompressor includes a U-net like network. A loss function is designed to improve a contrast and maintain an SNR. To make sure that an estimated image is close to the ground-truth, a mean squared error (MSE) is used as data fidelity term in the loss. To obtain a high contrast, a structural similarity index measure (SSIM) term is introduced in combination with the MSE.
- the enhancement NN blocks 187, 197 and 207 are introduced in the pipelines.
- the enhancement NN 187, 197 and 207 is trained by inputting low SNR inputs and a high SNR ground-truth.
- HR or CR artifacts may be added to the training dataset to improve the training performance.
- the enhancement NN 187, 197 and 207 includes a U-net like network. A specific loss function is designed to improve both SNR and contrast.
- a peak signal-to-noise ratio (PSNR) term is introduced in the loss function to make sure that high-SNR outputs are obtained.
- a combination of SSIM and Sobel-based edge loss functions is introduced.
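A hedged sketch of such a combined loss is given below; the global (single-window) SSIM, the SSIM constants, and the term weights are simplifications assumed for this example. PSNR is monotone in MSE, so an MSE term serves as the PSNR-oriented part.

```python
import numpy as np
from scipy.ndimage import sobel

def ssim_global(a, b, c1=1e-4, c2=9e-4):
    """Simplified single-window SSIM computed from global statistics."""
    ma, mb = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - ma) * (b - mb)).mean()
    return ((2 * ma * mb + c1) * (2 * cov + c2)) / \
           ((ma ** 2 + mb ** 2 + c1) * (va + vb + c2))

def enhancement_loss(pred, target, w_ssim=0.5, w_edge=0.5):
    """MSE (fidelity, i.e., the PSNR-oriented part) plus SSIM and
    Sobel-edge terms that push contrast and sharp edges."""
    mse = np.mean((pred - target) ** 2)
    ssim_term = 1.0 - ssim_global(pred, target)
    grad = lambda x: np.hypot(sobel(x, axis=0), sobel(x, axis=1))
    edge_term = np.mean(np.abs(grad(pred) - grad(target)))
    return mse + w_ssim * ssim_term + w_edge * edge_term

rng = np.random.default_rng(0)
tgt = rng.random((32, 32))
print(enhancement_loss(tgt + 0.05 * rng.standard_normal(tgt.shape), tgt))
```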
- the outputs of the enhancement NN 187, 197 and 207 are fed back to a next level alignment in order to continue enhancing an image quality.
- an output of the enhancement NN 187, 197 and 207 may have similar characteristics to the inputs of the alignment.
- a similarity loss is introduced, which is an L2-norm. It is noted that an L2-norm is chosen here to maintain the SNR.
- bit-counter e.g. 8 bits/counter
- photosensitive elements that belong to a same color channel (pixel binning in each color-filter channel) are grouped/binned (e.g., 2x2 for illustration purposes, without limiting the disclosure thereto) to share a bit-counter per group of pixels (photosensitive elements) (e.g., an RGGB color filter may require four separate groups of pixels and, thus, four independent bit-counters, one per color filter).
- a bit-counter per pixel-binning may be provided.
- a compression kernel size in a pixel binning may be selected depending on a compression required, e.g., 2x2, 3x3, 4x4, 5x5, etc., without limiting the disclosure to these values.
- different image frames (phases) may be acquired, wherein a location of the compression kernel may be shifted in each phase, e.g., using a pivoting pixel as a reference.
- as many frames may be recorded in the burst as are necessary to achieve a final output quality (e.g., SNR, sharpness, etc.).
- a location of a reference pixel (or of reference pixel values) of each phase SPB-group may be registered (situated) in a corresponding location in a pre-output array (which may have an original resolution, e.g., native sensor resolution).
- the locations of the pixel values of each phase may correspond to locations of the shifted compression kernels (SPB).
- a displacement (motion estimation) of the objects in the scene between phases may be obtained and then situated/located (registered) in the pre-output array, also considering a compression kernel position in each phase.
- each pixel-value of each phase only contributes to one single pixel of the pre-output array.
- the pre-output array image (which may have an original resolution, i.e., non-compressed image size) may differ from an original (non-compressed) image, e.g., it may seem blurred/out-of-focus due to the SPB compression.
- the SPB compressed image may be de-compressed.
- the de-compression may be based on applying analytical solutions, e.g., based on a convolution on the pre-output image using an inverted compression kernel (decompression kernel), which may allow to recover the original image.
- the de-compression may also be based on applying neural-network based solutions, e.g., making use of the known SPB kernel, a dataset may be prepared for training a NN, which may be capable of recovering the de-compressed image without blur/out-of-focus.
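The analytical route mentioned above can be sketched as a regularized inverse filter. Treating the pre-output image as the original convolved with the (box) compression kernel is an assumption of this example, as are the Wiener-style regularization and the eps value.

```python
import numpy as np

def decompress_pre_output(pre_output, kernel, eps=1e-2):
    """Treat the pre-output image as the original convolved with the SPB
    compression kernel and invert that convolution with a regularized
    (Wiener-style) frequency-domain filter."""
    h, w = pre_output.shape
    k = kernel.shape[0]
    psf = np.zeros((h, w))
    psf[:k, :k] = kernel / kernel.sum()
    psf = np.roll(psf, (-(k // 2), -(k // 2)), axis=(0, 1))  # center the PSF
    H = np.fft.fft2(psf)
    # eps keeps frequencies where H ~ 0 from being amplified into noise.
    G = np.conj(H) / (np.abs(H) ** 2 + eps)
    return np.real(np.fft.ifft2(np.fft.fft2(pre_output) * G))

restored = decompress_pre_output(np.random.rand(64, 64), np.ones((2, 2)))
```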
- each sensor read out may provide several images with different SPB-compression ratios (e.g., 3x3, 5x5, 7x7, etc., without limiting the disclosure to these values) simultaneously.
- This hardware feature may provide an output that may correspond to an image pyramid, which may be used to speed up the NN processing, e.g., by simplifying a number of convolutions needed for object alignment between frames.
- an output of a QBI NN that uses a compressed resolution of 2x2 SPB achieves a same image quality as when using a non-compressed resolution. Accordingly, some embodiments provide an on-sensor lossless hardware compression technique that may allow to reduce a data-rate from an image sensor to a host, such that a larger resolution array may be possible with a same sensor-host communication technology channel (e.g., MIPI).
- the technique may further allow to reduce a number of bit-counters necessary in an array of photosensitive elements of the image sensor, thus avoiding that a logic layer becomes a limiting factor for a final stacked sensor.
- a pixel pitch of the sensor array may be reduced, and the necessary number of bit-counters may still fit under the array of photosensitive elements.
- a decompression according to the technique may be simply implemented in an existing NN without a significant computational cost impact.
- QBI and QBI with SPB-compressed data may be applied based on two main approaches (i.e., based on an artificial NN or based on an analytical solution).
- a SPB-compression hardware implementation may be very complex in a non-SPAD-based pixel array because of a complexity of designing/fabricating the required hardware multiple-connections while keeping a signal integrity. Using a SPAD-based pixel array may thus facilitate the SPB-compression hardware implementation.
- the SPB-compression hardware implementation may be simpler than compressive sensing techniques that may require more complex interconnections, wherein these interconnections may even need to be adapted to each scene property.
- SPB-compression is especially suitable for QBI in some embodiments. In QBI, several images may be acquired to obtain a good quality image output. SPB-compression may use phases (images) to recover an original resolution.
- Fig.19 illustrates an embodiment of a general-purpose computer 250.
- the general-purpose computer 250 can be implemented such that it can basically function as any type of electronic device (e.g., the electronic device 1 of Fig.1), for example, a camera, a smartphone, smart glasses, a head-mounted display, a smartwatch, a mobile phone, a mobile tablet, a notebook, a terminal device or the like.
- the computer 250 has a CPU 251 (Central Processing Unit), which can execute various types of procedures and methods as described herein, for example, in accordance with programs stored in a read-only memory (ROM) 252, stored in a storage 257 and loaded into a random-access memory (RAM) 253, stored on a medium 260 which can be inserted in a respective drive 259, etc.
- the computer 250 includes an artificial intelligence (AI) processor 251a.
- the AI processor 251a may include a graphics processing unit (GPU) and/or a tensor processing unit (TPU).
- the AI processor 251a may be configured to execute an AI model (e.g., an artificial neural network), for example, the artificial neural network 18 of Fig.2.
- the CPU 251, the ROM 252 and the RAM 253 are connected with a bus 261, which in turn is connected to an input/output interface 254.
- the number of CPUs, memories and storages is only exemplary, and the skilled person will appreciate that the computer 250 can be adapted and configured accordingly for meeting specific requirements which arise when it functions as an information processing apparatus according to the present technology.
- a medium 260 compact disc (CD), digital video disc (DVD), universal serial bus (USB) flash drive, secure digital (SD) card, CompactFlash (CF) memory, or the like
- the input 255 can be a pointer device (mouse, graphics tablet, or the like), a keyboard, a microphone, a camera, a touchscreen, an eye-tracking unit etc.
- the output 256 can have a display (liquid crystal display (LCD), cathode ray tube (CRT) display, light-emitting diode (LED) display, electronic paper, etc.; e.g., included in a touchscreen), loudspeakers, etc.
- the storage 257 can have a hard disk drive (HDD), a solid-state drive (SSD), a flash drive and the like.
- the communication interface 258 can be adapted to communicate, for example, via universal serial bus (USB), MIPI, CSI, a serial port (RS-232), parallel port (IEEE 1284), a local area network (LAN; e.g., ethernet), wireless local area network (WLAN; e.g., Wi-Fi, IEEE 802.11), mobile telecommunications system (GSM, UMTS, LTE, NR etc.), Bluetooth, near-field communication (NFC), ZigBee, infrared, etc.
- the communication interface 258 may support other radio access technologies than the mentioned UMTS, LTE and NR. It should be recognized that the embodiments describe methods with an exemplary ordering of method steps. The specific ordering of method steps is however given for illustrative purposes only and should not be construed as binding. Changes of the ordering of method steps may be apparent to the skilled person. Please note that the division of the circuitry 2 into units 3 to 6 is only made for illustration purposes and that the present disclosure is not limited to any specific division of functions in specific units. For instance, the circuitry 2 could be implemented by a respective programmed processor, field programmable gate array (FPGA) and the like.
- the methods disclosed herein can also be implemented as a computer program causing a computer and/or a processor, such as the circuitry 2 discussed above, to perform the method, when being carried out on the computer and/or processor.
- a non-transitory computer-readable recording medium is provided that stores therein a computer program product, which, when executed by a processor, such as the processor described above, causes the method described to be performed.
- the circuitry is configured to input at least a part of the compressed image data to an artificial neural network and to execute the artificial neural network; wherein the artificial neural network is configured to: receive compressed image data as input; and perform the associating for phases represented by the received compressed image data.
- the inputted compressed image data are associated with a predefined number of phases; and wherein the artificial neural network is further configured to perform the generation of the output image data based on the inputted compressed image data.
- circuitry is further configured to: determine a number of phases that is associated with a predefined image quality of the output image data; and cause the image sensor to acquire the photon count for the determined number of phases.
- the deblurring includes estimating a representation of the portion of the object in the output image data at a predefined point in time based on acquisition times associated with the plurality of phases.
- the estimating includes: determining, based on the association, pixel values of the image data that correspond to the portion of the object, wherein the pixel values are based on the photon counts; and applying a decompression kernel to the determined pixel values, the decompression kernel being configured to generate pixel values of the output image data based on a change of the pixel values across the plurality of phases.
- a method comprising: obtaining, from an image sensor, compressed image data generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associating, for at least some of the plurality of phases, a portion of an object represented by the compressed image data; and generating output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data.
- the method of (12), wherein the deblurring of the image data includes transforming a position of the portion of the object of each phase of the plurality of phases such that the position of the portion of the object matches across the plurality of phases.
- the method comprises inputting at least a part of the compressed image data to an artificial neural network and executing the artificial neural network; wherein the artificial neural network performs: receiving compressed image data as input; and performing the associating for phases represented by the received compressed image data.
- the inputted compressed image data are associated with a predefined number of phases; and wherein the artificial neural network further performs the generation of the output image data based on the inputted compressed image data.
- the method of (21), wherein the estimating includes: determining, based on the association, pixel values of the image data that correspond to the portion of the object, wherein the pixel values are based on the photon counts; and applying a decompression kernel to the determined pixel values, the decompression kernel generating pixel values of the output image data based on a change of the pixel values across the plurality of phases.
- a computer program comprising program code causing a computer to perform the method according to any one of (12) to (22), when being carried out on a computer.
- a non-transitory computer-readable recording medium that stores therein a computer program product, which, when executed by a processor, causes the method according to any one of (12) to (22) to be performed.
- Circuitry configured to: obtain, from an image sensor, processed image data generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associate, for at least some of the plurality of phases, a portion of an object represented by the processed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the processed image data.
- Circuitry configured to: obtain, from an image sensor, compressed image data generated by the image sensor based on shifted-pixel binning; associate, for at least some of the plurality of shifted-pixel binning phases, a portion of an object represented by the compressed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of shifted-pixel binning phases, image data that are based on the compressed image data.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Image Processing (AREA)
Abstract
The disclosure pertains to circuitry that is configured to: obtain, from an image sensor, compressed image data that have been generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associate, for at least some of the plurality of phases, a portion of an object that is represented by the compressed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data.
Description
CIRCUITRY AND METHOD

TECHNICAL FIELD

The present disclosure generally pertains to circuitry and a method.

TECHNICAL BACKGROUND

It is generally known to perform burst imaging (BI). BI includes acquiring a plurality of image frames with a short exposure and merging the plurality of image frames (which may also be known as micro-frames) into an image frame such that a motion blur in the resultant image frame is reduced as compared to an image obtained by one-shot imaging with a long exposure. Although there exist techniques for BI, it is generally desirable to provide improved circuitry and an improved method, to allow obtaining images in lower light conditions and with less motion blur.

SUMMARY

According to a first aspect, the disclosure provides circuitry that is configured to: obtain, from an image sensor, compressed image data that have been generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associate, for at least some of the plurality of phases, a portion of an object that is represented by the compressed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data. According to a second aspect, the disclosure provides a method that includes: obtaining, from an image sensor, compressed image data that have been generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associating, for at least some of the plurality of phases, a portion of an object that is represented by the compressed image data; and generating output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data.
Further aspects are set forth in the dependent claims, the drawings and the following description.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments are explained by way of example with respect to the accompanying drawings, in which: Fig.1 illustrates an embodiment of an electronic device with a circuitry; Fig.2 illustrates an embodiment of a method performed by a circuitry; Fig.3 illustrates embodiments of binning photosensitive elements; Fig.4 illustrates an embodiment of a pixel binning in a first phase of shifted-pixel binning in a color array; Fig.5 illustrates an embodiment of a pixel binning in a second phase of shifted-pixel binning in a color array; Fig.6 illustrates an embodiment of a pixel binning in a third phase of shifted-pixel binning in a color array; Fig.7 illustrates an embodiment of a pixel binning in a fourth phase of shifted-pixel binning in a color array; Fig.8 illustrates an embodiment of generating an output high-resolution color image based on shifted-pixel binning; Fig.9 illustrates an embodiment of shifted-pixel binning in a grayscale array and generating an output high-resolution grayscale image based on shifted-pixel binning; Fig.10 illustrates an embodiment of generating output image data from compressed image data; Fig.11 illustrates an embodiment of generating a plurality of aligned frames from a shifted-pixel binning cycle; Fig.12 illustrates an embodiment of generating a plurality of output images from a shifted-pixel binning cycle; Fig.13 illustrates an embodiment of a neural network pyramidal workflow of shifted-pixel binning quanta burst imaging based on step window processing; Fig.14 illustrates an embodiment of a neural network pyramidal workflow of shifted-pixel binning quanta burst imaging based on sliding window processing;
Fig.15 illustrates an embodiment of a shifted-pixel binning quanta burst imaging data processing based on a recurrent neural network sequential workflow;
Fig.16 illustrates an embodiment of a neural network pyramidal workflow for non-compressed quanta imaging sensing data;
Fig.17 illustrates an embodiment of a recurrent neural network workflow for non-compressed quanta imaging sensing data;
Fig.18 illustrates embodiments of pipelines; and
Fig.19 illustrates an embodiment of a general-purpose computer.

DETAILED DESCRIPTION OF EMBODIMENTS

Before a detailed description of the embodiments under reference of Fig.1 is given, general explanations are made. As mentioned in the outset, it is generally known to perform burst imaging (BI). BI includes acquiring a plurality of image frames with a short exposure (e.g., 1/250 seconds, without limiting the disclosure to this value) and (aligning and) merging the plurality of image frames (which may also be known as micro-frames) into an image frame such that a motion blur in the image is reduced as compared to an image obtained by one-shot imaging with a long exposure (that may, for example, correspond to a sum of the short exposures of the plurality of image frames, e.g., 1/5 seconds, without limiting the disclosure to this value). BI may be used in photography and in video recordings. For example, for one-shot imaging, when a camera/scene moves while an exposure time is relatively long (which may be needed, e.g., in low-light scenes), the movement may result in motion-blurred images. Thus, a quality of the resulting images may be degraded for end users and/or for artificial intelligence (AI) feature extraction (e.g., an AI may need a minimum signal-to-noise ratio (SNR) and/or image quality (e.g., XY definition, sharpness, etc.) to perform object detection/recognition). For burst imaging (BI) (e.g., quanta burst imaging (QBI)), multiple images may be acquired using a shorter exposure time. Due to the shorter exposure time, each image of the burst may include less motion blur but also a lower SNR. For example, at a target distance of 0.59 m from a camera (e.g., an iPhone 13 Pro), a tangential velocity of the target of 1 m/s with respect to the camera may correspond to a blur of 5.13 pixels/ms of a resulting image (wherein the disclosure is not limited to these values). Therefore, a motion blur may be reduced by reducing an
exposure time. After aligning the images of the burst and merging them, an image with a lower motion blur than one-shot imaging, but an equivalent SNR may be obtained. For example, for illustration, a one-shot image may be acquired with a total exposure time of 1/5 s as reference. For burst imaging, a burst of 50 image frames with an exposure time of 1/250 s each may be acquired, resulting in a same total exposure time of 1/5 s as for the reference one-shot image. A result of the burst imaging after aligning and merging the burst of 50 image frames of 1/250 s exposure time each may correspond to a total exposure time of 1/5 s (i.e., the same total exposure time as for the reference one-shot image), but may include a reduced motion blur. It is noted that the technology is not limited to the values disclosed herein as exposure/shutter time and/or as number of image frames in a burst, and that any other suitable exposure/shutter time and/or number of image frames in a burst may be used instead. Further, to simplify the explanation, a sum of the exposures of the burst of image frames may correspond to a total exposure time of a one-shot image, but the technology is not limited thereto, and a sum of exposures of a burst of image frames may be different than a total exposure time of a reference one-shot image. Use cases of burst imaging may include obtaining a non-blurred image (or an image with a reduced motion blur), providing a higher frames-per-second (fps) value to a host, and/or generating a super-resolution image. However, when the number of counts per pixel is very small (e.g., in burst imaging), a read-out noise (additional to shot-noise) may be dominant in a conventional complementary metal-oxide-semiconductor (CMOS) imaging sensor (CiS), due to low ambient light and/or because an exposure time may be very short. Therefore, burst imaging with a conventional CiS may not achieve the same SNR as a one-shot image with the same total exposure time in some embodiments. A solution may involve using a sensor with a negligible read-out noise, e.g., a sensor with quanta image capability (e.g., with a read-out noise < 0.23 e- root mean square (rms)). Therefore, in some embodiments, Quanta Burst Imaging (QBI; also called quanta image sensing (QIS)) may be performed, e.g., burst imaging using a sensor with a read-out noise < 0.23 e- rms. A suitable sensor to do this may include a single-photon avalanche diode (SPAD) based sensor. However, QBI/QIS may require a high data-rate from a sensor to a host for transmitting acquired image frames from the sensor to the host. In some instances, a throughput between a sensor and a host is limited by current technologies. Each image of the burst may have a short bit-depth (e.g., instead of sending one frame with a bit-depth of 10 bit per pixel, the sensor may send 2^10 frames with a bit-depth of one bit per pixel). However, the disclosure is not limited to a bit-depth of one
bit per pixel. Burst imaging may be performed with any suitable bit-depth of the image frames, e.g., two bit per pixel, three bit per pixel, four bit per pixel or the like. For reducing a data-rate (throughput) from a sensor to a host in burst imaging, information may be compressed in the sensor before it is sent to the host. Further, a computational cost/time of QBI, although not linearly proportional, may be proportional to an amount of data that needs to be processed. Therefore, compressed information from the sensor may be used for reducing the amount of data that needs to be processed. Some embodiments may have a further benefit that may be based on an image pyramid. Further, in some embodiments that involve a conventional SPAD-based sensor array (which may have, e.g., a pixel pitch of less than 3 µm, without limiting the disclosure thereto) in a stack-sensor configuration, a technology of a logic-node of a logic-layer may determine a size (which may be linked with a price) of a final chip (when assuming that a device with a competitive price should be mass produced). For example, when designing a logic-layer that contains an 8-bit counter per pixel under a SPAD array, a pixel pitch may determine a necessary logic-node technology. The smaller the pixel pitch (e.g., for obtaining more megapixels per area, which may result in a cheaper sensor array), the smaller the logic-node technology that may be needed (and the more expensive the logic-node may be). Therefore, a compression technique on the sensor hardware may be introduced that may avoid a use of dedicated bit-counters per pixel. This may allow reducing a pixel pitch, and at the same time may not demand a reduction in the logic-node technology. Consequently, some embodiments of the disclosure pertain to circuitry that is configured to: obtain, from an image sensor, compressed image data that are generated by the image sensor based on shifted-pixel binning (SPB), wherein the SPB is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associate, for at least some of the plurality of phases, a portion of an object that is represented by the compressed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data. The circuitry may include a programmed microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or the like, which may be configured to
execute software and/or firmware instructions, which may cause the circuitry to provide the functionality described herein. The circuitry may further include a memory (e.g., based on Dynamic Random-Access Memory (DRAM), Static Random-Access Memory (SRAM), flash memory, Electrically Erasable Programmable Read-Only Memory (EEPROM), magnetic memory and/or the like) that may store the instructions. The memory may further store data that are necessary for performing the functionality of the circuitry, e.g., the compressed image data, the association of the portion of the object and/or the output image data, as well as temporary data. The circuitry may also include a communication unit for communicating with one or more other devices. The communication unit may include a Mobile Industry Processor Interface (MIPI) Alliance interface (e.g., a Camera Serial Interface (CSI)), a Peripheral Component Interconnect (PCI) interface, a Universal Serial Bus (USB) interface, a serial port (RS-232) interface, a parallel port (IEEE 1284) interface, an Ethernet interface, a Wi-Fi (IEEE 802.11) interface, a Bluetooth interface or the like. For example, the circuitry may receive the compressed image data via the communication unit from the image sensor and/or may transmit the output image data via the communication unit to another information processing apparatus that may further process the output image data. The image sensor may include an array of photosensitive elements. The photosensitive elements may be arranged in a two-dimensional array with any suitable number of rows and columns, or in a one-dimensional array. The photosensitive elements may be based on single-photon avalanche diode (SPAD) technology and may be configured to detect a single photon based on an avalanche of charge carriers that may be caused by the single photon. However, the disclosure is not limited to SPADs. For example, the photosensitive elements may be based on jots and may be configured to detect a single photon without an avalanche of charge carriers. Nevertheless, SPADs may allow a smaller circuit size and/or a lower power consumption as compared to jots. The image sensor may be configured to perform shifted-pixel binning (SPB). For shifted-pixel binning, the image sensor may acquire a plurality of image frames, wherein each image frame may be associated with a phase of SPB. The image sensor may bin the photosensitive elements into bins according to a phase of SPB. For example, each phase may be associated with a predefined binning pattern for binning the photosensitive elements. For each image frame of the plurality of image frames, the image sensor may bin the photosensitive elements according to the binning pattern that is associated with the phase associated with the frame, and the image sensor may acquire one photon count per bin of photosensitive elements.
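Since one photon count per bin replaces the individual values of the binned photosensitive elements, the achievable reduction of the sensor-to-host data rate may be illustrated by the following hedged back-of-the-envelope sketch (Python; all values are invented examples and none of them is limiting):

    # Hypothetical data-rate comparison for SPB compression (illustrative values only).
    pixels = 2048 * 2048     # photosensitive elements of the image sensor
    kernel = 2 * 2           # 2x2 binning kernel -> one count per four elements
    bit_depth = 4            # bits per transmitted photon-count value
    fps = 1000               # micro-frame rate

    raw_rate = pixels * bit_depth * fps              # without binning
    spb_rate = (pixels // kernel) * bit_depth * fps  # with 2x2 SPB
    print(raw_rate / 1e9, spb_rate / 1e9)            # approx. 16.8 vs. 4.2 Gbit/s

Under these assumptions, the data-rate reduction equals the kernel size (here 4:1), in line with the compression ratio discussed below.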
For the binning, the image sensor may connect all photosensitive elements of a bin to a single counter such that the counter may receive, via corresponding wiring, and count signals that may indicate a charge carrier avalanche caused by a photon in any photosensitive element (e.g., SPAD) of the bin. Alternatively, for the binning, the image sensor may sum photon counts from several counters each of which may receive, via corresponding wiring, and count signals that may indicate a charge carrier avalanche caused by a photon in one photosensitive element or in a subset of photosensitive elements of a bin. The skilled person may find further ways of binning photosensitive elements. The binning patterns may be based on a predefined kernel. The kernel may indicate a size and shape of the bins. For example, the kernel may indicate a size of 2x2 photosensitive elements, 3x3 photosensitive elements, 4x4 photosensitive elements, 1x2 photosensitive elements, 2x1 photosensitive elements, MxM photosensitive elements or MxN photosensitive elements (where M and N may be any suitable integers). The skilled person may find a suitable size for the kernel. The bins may have a same size according to a same kernel or may have different sizes according to different kernels. The bins may be shifted between subsequent phases of SPB according to the predefined binning patterns. For example, each bin may be shifted in a region of the array of photosensitive elements around a center-of-mass that may be associated with the bin. For example, the bins may be shifted, between subsequent phases, by one row and/or by one column of the array of photosensitive elements. For example, the bins may be shifted through the region of the array according to a predefined shifting pattern or randomly. Due to the shifting, the photon count acquired for a bin in the different phases may on average correspond to a photon count at the center-of-mass that is associated with the bin. The compressed image data may be compressed in that the image sensor may acquire a single photon count per bin and per image frame such that a number of acquired photon counts per image frame may be lower than a number of photosensitive elements of the image sensor. A compression ratio at which the compressed image data may be compressed may correspond to the kernel size of the kernel. Due to the compression, a smaller amount of data may have to be transmitted from the image sensor to the circuitry, such that the circuitry may be able to obtain the compressed image data at a higher frame rate, and/or a power consumption of the image sensor and/or of the circuitry may be reduced. A size and/or complexity of the image sensor and/or circuitry may also be reduced in some embodiments. Further, in some embodiments, a number of counters that is required in a
logic layer of the image sensor may be lower than the number of photosensitive elements of the image sensor, such that a larger logic-node technology may be used although a smaller pixel pitch may be realized. The circuitry may be configured as a host of the image sensor that receives the compressed image data from the image sensor. The obtaining of the compressed image data may include causing the image sensor to generate the compressed image data and receiving the compressed image data from the image sensor via the communication unit. For generating the compressed image data, the image sensor may acquire a plurality of image frames based on SPB (e.g., one image frame per phase of the SPB), as described above, wherein the image sensor may acquire one photon count per bin of photosensitive elements for each image frame. The image sensor may generate the compressed image data such that the compressed image data may include a value that is based on an acquired photon count for each bin of photosensitive elements and for each image frame. The value that is based on the acquired photon count may be configured as a Boolean value, as an integer value or as a floating-point value. For example, the value may indicate a number of counted photons or may be converted to a value within a predefined range (e.g., within a range of values that can be represented by a data type of the value). For example, the value may be chosen as a one-bit value (e.g., zero/false or one/true), as a two-bit value (e.g., from zero to three), as a four-bit value (e.g., from zero to 15), as a floating-point value (e.g., from 0.0 to 1.0), as a percentage (e.g., from 0 % to 100 %) or the like. A data type and range of the value are not limited to these examples, and the skilled person may find a suitable data type and range for the value. For example, the value may represent one pixel of an image frame of the compressed image data. For example, the image sensor may transmit the compressed image data as a stream, such that the circuitry may process earlier frames of the compressed image data before the image sensor has generated later frames of the compressed image data. The circuitry may accumulate a predefined number of subsequent image frames of the compressed image data and generate the output image data based on the accumulated subsequent image frames. The image sensor may acquire the image frames with a short exposure time, e.g., at a frame rate of 250 fps, 500 fps or 1000 fps (without limiting the disclosure to these values). Therefore, the image frames of the compressed image data may have a low signal-to-noise ratio (SNR). The circuitry may merge subsequent frames of the compressed image data into an output frame for
obtaining a higher SNR. For example, the circuitry may merge pixel values of the subsequent frames such that a shot noise may average out and the output frame may have a higher SNR. For generating the output image data, the circuitry may upsample the image frames of the compressed image data. For example, the circuitry may generate the output image data such that a number of rows and/or columns of pixels in an output frame of the output image data corresponds to a number of rows and/or columns, respectively, of the array of photosensitive elements in the image sensor. However, the disclosure is not limited to upsampling the output image data to a resolution of the array of photosensitive elements. The circuitry may as well upsample the output image data to a resolution that may be lower or higher than that of the array of photosensitive elements, or may omit the upsampling and instead generate the output image data with a resolution that corresponds to the number of bins. When generating the compressed image data, the image sensor may, in some cases, image an object that moves with respect to the image sensor. For example, the image sensor may be directed at a scene and may receive light from the scene (e.g., through a lens, a mirror or the like, which may focus the light on the image sensor). Thus, each photosensitive element of the image sensor may correspond to a position in the scene, and light detected by a photosensitive element of the image sensor may indicate an optical property (e.g., color, brightness etc.) of an object at the corresponding position in the scene. An object in the scene may move while the image sensor is acquiring the image frames for SPB. For example, in the scene, a person may walk, jump, move an arm or a leg, or the like; an animal (e.g., cat, dog, bird, fish etc.) may walk, jump, fly, swim or the like; a car or other vehicle may drive; an airplane or helicopter may fly; a boat or ship may sail; or the like. A position of the moving object may change between the phases of SPB and, thus, between the image frames. Likewise, the image sensor may be moved (e.g., translated, rotated or moved in distance) with respect to the scene, or a lens or mirror may be moved (e.g., zoomed or panned) in front of the image sensor such that light from another portion of the scene may be focused on the image sensor. Therefore, the circuitry may associate, for at least some of the plurality of phases, a portion of an object that is represented by the compressed image data. The portion may be represented by one or more pixels, by a set of pixels, by a wavelet, by an edge, by a corner and/or by any other suitable image feature in the compressed image data. The circuitry may associate the portion of the object in subsequent image frames of the compressed image data (e.g., for each or some of the plurality of phases), such that in each image frame (e.g., for each phase, or for some of the
phases) a position may be identified at which the portion of the object is represented (e.g., for those phases of the plurality of phases for which a position of the portion of the object can be determined). For example, the associating of the portion of the object may include associating positions in the compressed image data for (at least some of) the plurality of phases (e.g., in the image frames acquired for the respective phases) with the portion of the object. The associating of the portion of the object may include matching (e.g., within a predefined tolerance or threshold) and/or identifying (representations of) the portion of the object in the compressed image data for (at least some of) the plurality of phases. The associating of the portion of the object may include determining/identifying, for (at least some of) the plurality of phases, portions in the compressed image data that correspond to (e.g., represent) the portion of the object. By the associating, the circuitry may determine an association of the portion of the object with positions (e.g., pixels) for (at least some of) the plurality of phases (e.g., in the image frames) in the compressed image data so as to track a movement of the (portion of the) object through (at least some of) the phases (image frames). Similarly, the circuitry may associate a plurality of portions of the object and/or a plurality of portions of a plurality of objects that are represented by the compressed image data. The plurality of objects may include objects that move and objects that do not move (e.g., that are static) throughout the plurality of phases. Thus, with the associating, the circuitry may track the portion(s) of the object(s) through the phases (image frames) of the compressed image data. Based on the determined association, portions of the image data of the plurality of phases may be aligned. For example, for generating the image data that are based on the compressed image data, the circuitry may arrange pixels of the compressed image data of the plurality of phases according to the determined association. The image data may include an image frame in which pixels of compressed image data of the plurality of phases are arranged. The image frame of the image data may have a higher resolution (e.g., may have more pixels) than image frames of the compressed image data in a vertical and/or in a horizontal direction. For example, a number of pixels of the image frame of the image data may correspond to a product of a number of pixels per image frame of the compressed image data and a number of phases whose compressed image data are included in the image data. For example, the number of pixels of the image frame of the image data may correspond to a sum of numbers of pixels of all image frames of the compressed image data that are included in the image data. Accordingly, the image data may be configured
as a pre-output array, in which pixels of the compressed image data are arranged (e.g., aligned) according to the determined association. The association (and, in some embodiments, an alignment based on the association) may be performed based on computer vision (CV; e.g., image processing) methods and/or based on a machine learning method such as an artificial neural network. Since the compressed image data may represent a moving object at different positions for different phases (e.g., in different image frames), a motion blur caused by a movement of the object may be reduced by arranging (e.g., aligning) the pixels of the compressed image data according to the determined association. For example, the circuitry may interpolate the movement of the (portion of the) object for determining the position of the (portion of the) object in the image data (e.g., in the pre-output array), or the circuitry may select a phase (and, thus, an image frame) of the compressed image data from which a position of the (portion of the) object should be adopted in the image data (e.g., pre-output array). However, the circuitry may perform further processing on the image data (e.g., on the pre-output array) before generating the output image data. Due to the SPB, each pixel of the compressed image data may represent a plurality of photosensitive elements of the image sensor according to the binning. Thus, the image data may appear blurred (e.g., defocused). For generating the output image data, the circuitry may perform a deblurring of the image data (e.g., the compressed image data or the pre-output array) for reducing the blur (appearance of being defocused) when merging the image frames of the plurality of phases. The deblurring may be based on determining (e.g., calculating, estimating or the like) a position and/or representation of the portion of the object in an output frame of the output image data. The circuitry may determine the position and/or representation of the portion of the object in the output frame based on the association and, thus, based on the binning of the photosensitive elements of the image sensor and, if (the portion of) the object has moved, also based on the tracked movement of the (portion of the) object. For example, the circuitry may reconstruct, based on the association and on the binning, a detail of (the portion of) the object from the compressed image data. The circuitry may then sum pixel values of the image data (e.g., of the pre-output array which may correspond to upsampled compressed image data) of the positions of the portion of the object from (at least some of) the plurality of phases (e.g., from the image frames) and may assign the resulting summed pixel value(s) to one or more pixels of the output image data (e.g., of an output frame) at a position of the portion of the object in the output frame. The deblurring of the (upsampled compressed) image data and/or the generating of the output image data is not
limited to summing pixel values, but may include further image processing techniques such as adjusting a brightness and/or a color of one or more image frames and/or applying an image filter. The deblurring may be based on analytical methods (e.g., based on detecting edges of objects, wherein an edge may be determined as a boundary between adjacent areas of different colors or textures, wherein a difference of colors or textures between the areas may exceed a predefined threshold) and/or on a machine learning method (e.g., an artificial neural network). The circuitry may associate, align and deblur each detected object (or portion thereof) separately from other (portions of) objects represented by the compressed image data. The output image data, which are based on deblurring the image data, may represent the scene with a detailedness that corresponds to the number of photosensitive elements of the image sensor. Thus, the deblurring may correspond to a decompression that reconstructs a representation of the scene based on the compressed image data that have been compressed by SPB. Instead of generating, as the image data, the pre-output array and performing the deblurring on the pre-output array, the circuitry may use the compressed image data as the image data (that are based on the compressed image data and) that are deblurred. For example, the circuitry may direct pixels of the compressed image data into a deblurring algorithm according to the determined association, without generating an intermediate pre-output array. For example, the circuitry may input the compressed image data to an artificial neural network, and the artificial neural network may perform the associating, the deblurring, and the generating of the output image data. The circuitry may be included in an electronic device that is capable of taking images. The electronic device may include, e.g., a camera, a smartphone, a tablet, a notebook, a head-mounted display (HMD), smartglasses, a smartwatch, a microscope, a (manually driven or autonomously driving) vehicle, a drone, a robot or the like. In some embodiments, the deblurring of the image data includes transforming a position of the portion of the object of each phase of the plurality of phases such that the position of the portion of the object matches across the plurality of phases. The deblurring may reconstruct a representation of the portion of the object for each (or at least some) of the plurality of phases such that a representation of the portion of the object in the output image data matches (e.g., is consistent) across the plurality of phases (or that
contributions from the plurality of phases to the representation of the portion of the object in the output image data correspond to a same representation of the object). For example, in a case of four SPB phases in which compressed image data are generated, and if the circuitry is configured to use a latest phase of the four phases as a reference alignment frame (whereas the circuitry may also be configured to align to any other of the phases in some embodiments), (portions of) objects that appear in the output image data may be moved (aligned) from their positions in image frames of the compressed image data of the respective phases to their position in the reference alignment frame according to the association, and the deblurring may further refine an appearance of (the portions of) the objects in the output frame. The transforming may include translating, rotating, scaling, shearing or the like according to a movement of the associated portion of the object such that representations (e.g., pixels) from multiple phases (e.g., image frames) of the compressed image data may be superimposable. In some embodiments, the circuitry is configured to input at least a part of the compressed image data to an artificial neural network and to execute the artificial neural network; and the artificial neural network is configured to: receive compressed image data as input, and perform the associating for phases represented by the received compressed image data. The artificial neural network may include a Feed-Forward Network, a Residual Network (ResNet), a Recurrent Neural Network (RNN), a Convolutional Neural Network (CNN), a Generative Adversarial Network (GAN), a Transformer Neural Network, a Neural Radiance Field (NeRF), a Variational Autoencoder (VAE), and/or any other suitable neural network architecture. The skilled person may find a suitable architecture for the artificial neural network based on their expert knowledge. The artificial neural network may implement a generative artificial intelligence (AI). For example, the generative AI may use or may be configured similar to DALL-E, Midjourney, Stable Diffusion, or the like. The circuitry may include an AI unit (e.g., a graphics processing unit (GPU) and/or a tensor processing unit (TPU)) for executing the artificial neural network (e.g., for computing an output of the artificial neural network). The artificial neural network may include input nodes, hidden layers and output nodes. The circuitry may provide the compressed image data (e.g., a predefined number of subsequent image frames of the compressed image data) to input nodes of the artificial neural network, execute the artificial neural network and read an output of the artificial neural network from the output nodes.
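As a minimal, hypothetical sketch (not the claimed network; the architecture and all names are invented), the following PyTorch snippet shows one conceivable way of providing a stack of upsampled phase frames, together with a previously determined output frame, to the input of such a network and of reading its output; the weighting of the previous output anticipates the recurrent variant described further below:

    import torch
    import torch.nn as nn

    class SPBMergeNet(nn.Module):
        """Hypothetical minimal network: receives N binned phase frames plus a
        previously generated output frame and predicts a deblurred output frame."""
        def __init__(self, phases=4):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(phases + 1, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 1, 3, padding=1),
            )

        def forward(self, phase_frames, prev_output, prev_weight=0.5):
            # Weight the previous output differently than the fresh compressed
            # data (0.5 is an arbitrary example coefficient).
            x = torch.cat([phase_frames, prev_weight * prev_output], dim=1)
            return self.body(x)

    net = SPBMergeNet()
    frames = torch.rand(1, 4, 32, 32)  # four upsampled phase frames (invented data)
    prev = torch.rand(1, 1, 32, 32)    # previously determined output frame
    out = net(frames, prev)            # output read from the output nodes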
Alternatively, the circuitry may be connected to an external AI unit, may transfer an input for the artificial neural network to the external AI unit, and may receive an output of the artificial neural network from the external AI unit. The associating performed by the circuitry may include causing the artificial neural network to determine the association of the portion(s) of the object(s) as described above. In some embodiments, the inputted compressed image data are associated with a predefined number of phases; and the artificial neural network is further configured to perform the generation of the output image data based on the inputted compressed image data. For example, the circuitry may input, to the artificial neural network, compressed image data that may be associated with one, two, three or more phases (without limiting the disclosure to these values). For example, the circuitry may input, to the artificial neural network, a number of phases that corresponds to one SPB cycle. The artificial neural network may perform the associating iteratively. For example, the artificial neural network may determine the association with respect to compressed image data for which the artificial neural network has determined an association in a previous run. Alternatively, the artificial neural network may perform the associating between (at least some of) the phases of the inputted compressed image data, and the alignment and the deblurring may be based on associations from several runs of the artificial neural network. Alternatively, the circuitry may input, to the artificial neural network, compressed image data that may be associated with as many phases as the output image data (e.g., an output frame) should be generated from, such that the artificial neural network may determine the association for as many phases as necessary for generating (an output frame of) the output image data. In some embodiments, the inputted compressed image data represent at least one of the plurality of phases; and the artificial neural network is further configured to: receive, as input, previously determined output image data; and associate the portion of the object between the inputted compressed image data and the previously determined output image data. The circuitry may input, to one or more input nodes of the artificial neural network, at least one image frame of the compressed image data, wherein the inputted image frame may be associated with one of the plurality of phases (e.g., the image sensor may have acquired the image frame in the phase of SPB with which the image frame may be associated). The circuitry may further input, to one or more input nodes of the artificial neural network, output image data which the circuitry may have determined previously. The output image data
inputted to the artificial neural network may include an output frame of the output image data (e.g., a latest generated output frame). The associating of the portion of the object between the inputted compressed image data (e.g., the inputted image frame) and the previously determined output image data (e.g., the inputted output frame) may include identifying a position of the portion of the object in the inputted image frame of the compressed image data, identifying a position of the portion of the object in the inputted output frame of the output image data, and associating the identified positions of the portion of the object in the inputted image frame and in the inputted output frame, respectively, with the (portion of the) object. Thus, the artificial neural network may track the portion of the object between the previously determined output image data (e.g., the inputted output frame) and the inputted compressed image data (e.g., the inputted image frame). In some embodiments, the artificial neural network is further configured to weight the previously determined output image data differently than the received part of the compressed image data. The weighting may include generating an output of the artificial neural network with unequal contributions from the previously determined output image data and the received compressed image data, respectively. For example, the artificial neural network may weight the received compressed image data (e.g., the inputted image frame(s)) more strongly than the previously determined output image data (e.g., the inputted output frame), such that an output of the artificial neural network may be closer to the received compressed image data than to the previously determined output image data. Thus, the artificial neural network may avoid an overly strong impact of the previously determined output image data on a present generation of output image data. For example, the weighting may reduce a risk that an output of the artificial neural network (and/or output image data that may be generated based on the output of the artificial neural network) represents the portion of the object in a state (e.g., at a position) prior to a point in time that corresponds to the received (image frame(s) of the) compressed image data. The weighting may correspond to a multiplication with a coefficient. For example, the received compressed image data may be weighted with a coefficient of 1, and the previously determined output image data may be weighted with a coefficient of 0.9, 0.5, 0.1, 0.01 or the like. For example, the previously determined output image data may be weighted with a coefficient of 1, and the received compressed image data may be weighted with a coefficient of 1.5, 2, 5, 10, 100 or the like. It is noted that the disclosure is not limited to these values or value ranges or to a
multiplication with a coefficient, and the skilled person may find other suitable coefficients, value ranges and/or techniques for the weighting. In some embodiments, the output image data include a frame whose number of pixels exceeds a number of photosensitive elements that have acquired the photon count. For example, the output image data may correspond to super-resolution image data and may be generated with any suitable technique for generating a super-resolution image. For example, the output image data may include 1.5, 2, 3, 4 or the like times as many pixels in a horizontal and/or in a vertical direction of an output frame as there are photosensitive elements in a horizontal and/or vertical direction, respectively, in the array of photosensitive elements of the image sensor. However, the disclosure is not limited to these resolutions or to this range of resolutions. The circuitry may generate the super-resolution image by a (e.g., linear or cubic) interpolation of pixels and/or may cause an artificial neural network to generate the super-resolution image (e.g., the artificial neural network that may be used for associating the portion of the object and/or another artificial neural network that may be trained for generating a super-resolution image). In some embodiments, the shifted-pixel binning is based on: binning the photosensitive elements at a plurality of kernel sizes; and acquiring the photon count for the correspondingly binned photosensitive elements for each respective kernel size simultaneously. The kernel size may correspond to a number of photosensitive elements of the image sensor that may be binned for acquiring a photon count, as described above. For example, the image sensor may acquire photon counts simultaneously from a bin of 4x4 photosensitive elements and from four bins of 2x2 photosensitive elements, wherein the four bins of 2x2 photosensitive elements may be included in the bin of 4x4 photosensitive elements. For example, the image sensor may acquire photon counts simultaneously from a bin of 6x6 or 9x9 photosensitive elements and from four or nine bins of 3x3 photosensitive elements, wherein the bins of 3x3 photosensitive elements may be included in the bin of 6x6 or 9x9 photosensitive elements. However, the disclosure is not limited to these kernel sizes or to square bins. For example, a number of photosensitive elements of a bin in a horizontal direction may differ from a number of photosensitive elements of the bin in a vertical direction. Bins of a smaller kernel size may also overlap more than one bin of a larger kernel size. Further, photon counts from overlapping bins of more than two different kernel sizes may be acquired simultaneously. Acquiring the photon count from bins with different kernel sizes simultaneously may allow for associating the portion of the object at different levels of granularity. For example, the
association may first be determined on a coarse level based on a larger kernel size, and then on a fine level based on a smaller kernel size, wherein the associating on the fine level may be based on bins with the smaller kernel size that overlap bins with the larger kernel size that have been associated with the portion of the object. Therefore, acquiring the photon count from bins with different kernel sizes simultaneously may allow reducing an amount of processing needed for the associating, and thus may increase a performance (e.g., frame rate and/or quality of the output image data) and/or reduce a consumption of electric power and/or of computing resources. In some embodiments, the circuitry is further configured to: determine a number of phases that is associated with a predefined image quality of the output image data; and cause the image sensor to acquire the photon count for the determined number of phases. The predefined image quality may include a pixel resolution of the output image data, a frame rate of the output image data, a temporal resolution for tracking the portion of the object based on the compressed image data, a sharpness to be established by the association, alignment and/or deblurring, or the like. The number of phases may be associated with a number of image frames from the compressed image data on which an output frame of the output image data may be based and/or with a binning pattern (e.g., number and/or size of bins) of the SPB. For example, the circuitry may determine the number of phases from a table in a storage of the circuitry, may calculate the number of phases from a specification of the predefined image quality, may estimate the number of phases based on an image quality of the compressed image data, or the like. Thus, the circuitry may select a number of phases such that the predefined image quality may be ensured. The circuitry may also select a number of phases such that an unnecessarily large number of phases (and, thus, unnecessary consumption of computing and/or energy resources) may be avoided. In some embodiments, the deblurring includes estimating a representation of the portion of the object in the output image data at a predefined point in time based on acquisition times associated with the plurality of phases. The estimating may reconstruct details of the portion of the object that are not visible in the (compressed)
object changes between the phases (e.g., moves to another position, changes an orientation, changes color, changes brightness or the like), the deblurring may generate the output image data such that they may represent the (portion of the) object as it may appear at the predefined point in time. The predefined point in time may correspond to a time at which the photon count for a first phase, for a last phase, for a median phase or the like of an SPB cycle may have been acquired. The predefined point may also correspond to a mean time during which the image sensor has acquired compressed image data for an SPB cycle, to a time at which the (portion of the) object is in a predefined state, or the like. The circuitry may also estimate a representation of the portion of the object at multiple predefined times, e.g., for each phase of an SPB cycle, or at points in time that may be distributed over an SPB cycle according to a predefined frame rate of the output image data. Prior to the deblurring, the circuitry may align, for generating the pre-output array in accordance with the association, portions (e.g., pixels) of the compressed image data to the predefined point in time. Thus, a blur in the compressed image data, which may be caused by the binning of photosensitive elements or by a change or movement of the (portion of the) object between the phases, may be reduced by the estimating of the representation of the portion of the object, such that the output image data may include a sharper representation of the portion of the object. For example, the deblurring may restore more details of the scene than the association (and corresponding aligning). In some embodiments, the estimating includes: determining, based on the association, pixel values of the image data that correspond to the portion of the object, wherein the pixel values are based on the photon counts; and applying a decompression kernel to the determined pixel values, wherein the decompression kernel is configured to generate pixel values of the output image data based on a change of the pixel values across the plurality of phases. For example, the decompression kernel may upsample the image data (e.g., the compressed image data, or image data that are based on the compressed image data, such as the pre-output array) to a resolution that may correspond to a number of photosensitive elements in the image sensor, and may estimate (e.g., interpolate, extrapolate, average etc.) pixel values. The decompression kernel may transform (e.g., translate, rotate, scale, shear etc.) the pixel values of the compressed image data that correspond to the portion of the object (e.g., according to the association) such that they may overlap. The decompression kernel may sum, average, interpolate, etc. the overlapping pixel values for generating the pixel values of the output image
data. The decompression kernel may be based on analytic processing and, thus, may allow for generating the output image data without executing an artificial neural network. Some embodiments pertain to a method that includes: obtaining, from an image sensor, compressed image data that have been generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associating, for at least some of the plurality of phases, a portion of an object represented by the compressed image data; and generating output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data based on the compressed image data. The method may be performed by the circuitry described above. For any feature described with respect to the circuitry, embodiments of the method may include a corresponding feature. The methods as described herein are, in some embodiments, also implemented as a computer program that causes a computer and/or a processor to perform the method when being carried out on the computer and/or processor. In some embodiments, also a non-transitory computer-readable recording medium is provided that stores therein a computer program product, which, when executed by a processor, such as the processor described above, causes the methods described herein to be performed. Some embodiments pertain to circuitry that is configured to: obtain, from an image sensor, processed image data that have been generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associate, for at least some of the plurality of phases, a portion of an object that is represented by the processed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the processed image data. Some embodiments pertain to circuitry that is configured to: obtain, from an image sensor, compressed image data that have been generated by the image sensor based on shifted-pixel binning; associate, for at least some of the plurality of shifted-pixel binning phases, a portion of an object that is represented by the compressed image data; and generate output image data by
deblurring, based on the association of the portion of the object for at least some of the plurality of shifted-pixel binning phases, image data that are based on the compressed image data. Furthermore, the features described above with respect to the circuitry and/or to the method may be combined in any suitable way. Returning to Fig.1, it illustrates an embodiment of an electronic device 1 with a circuitry 2. The electronic device 1 is configured as a camera for taking images and movies. The electronic device 1 includes the circuitry 2, which is an example of the circuitry described above. The circuitry 2 includes a control unit 3, an artificial intelligence (AI) unit 4, a storage unit 5 and a communication unit 6. The control unit 3 includes a CPU and is configured to control a functionality of the electronic device 1. The AI unit 4 includes a GPU and is configured to execute an artificial neural network. The storage unit 5 includes a volatile DRAM storage and a non-volatile flash storage, and is configured to store instructions for the control unit 3, parameters of an artificial neural network executed by the AI unit 4, and any further data that are necessary for a functionality of the circuitry 2. The communication unit 6 includes interfaces for communicating with other devices, including a Camera Serial Interface (CSI) for receiving compressed image data, and a Universal Serial Bus (USB) interface for transmitting output image data generated by the circuitry. The electronic device 1 further includes an image sensor 7 and a lens 8. The image sensor 7 includes a two-dimensional array 7a of single-photon avalanche diodes (SPADs), which are examples of photosensitive elements. The image sensor 7 also includes a logic layer 7b that is laminated to the array 7a of SPADs on a side that is opposite to a light-incident side of the array 7a of SPADs. The lens 8 focuses light from a scene on the array 7a of SPADs. The image sensor 7 is configured to generate compressed image data based on shifted-pixel binning (SPB), as described with respect to Fig.3 to 9, and to transmit the generated compressed image data via CSI to the circuitry 2. The circuitry 2 is configured to receive and process the compressed image data and to generate output image data according to any one of the methods described below with respect to Fig.10 to 18. Fig.2 illustrates an embodiment of a method 10 performed by the circuitry 2 of Fig.1. At 11, the control unit 3 determines a number of phases that is associated with a predefined image quality of the output image data.
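One conceivable, purely illustrative implementation of the determination at 11 is a stored table that maps a requested image quality to a number of SPB phases; the quality labels and phase counts in the following Python sketch are invented and non-limiting:

    # Hypothetical mapping from a predefined image quality to a phase count.
    PHASES_FOR_QUALITY = {
        "preview": 2,    # coarse but fast
        "standard": 4,   # one full 2x2 SPB cycle
        "fine": 9,       # e.g., one 3x3 SPB cycle
    }

    def determine_phases(quality: str) -> int:
        return PHASES_FOR_QUALITY.get(quality, 4)  # default: one 2x2 cycle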
At 12, the control unit 3 causes the image sensor 7 to acquire the photon count for the determined number of phases. At 13, the image sensor 7 generates compressed image data based on shifted-pixel binning (SPB). The generating of the compressed image data based on SPB includes binning, at 14, SPADs of the array 7a of SPADs (which are examples of photosensitive elements) for as many phases as determined at 11 according to a predefined binning pattern at a plurality of kernel sizes, and acquiring, at 15, in each phase a photon count for the correspondingly binned SPADs for each respective kernel size simultaneously. At 16, the communication unit 6 obtains, from the image sensor 7, the compressed image data, which have been generated at 13. At 17, the circuitry 2 associates, for at least some of the plurality of phases, a portion of an object represented by the compressed image data. For the associating of the portion of the object, the control unit 3 inputs at least a part of the compressed image data to an artificial neural network 18, and the AI unit 4 executes the artificial neural network. As indicated by an arrow in Fig.2, the artificial neural network 18 receives compressed image data as input, and performs the associating for phases represented by the received compressed image data. The inputted compressed image data represent at least one of the plurality of phases. The artificial neural network 18 further receives, as input, previously determined output image data; and associates the portion of the object between the inputted compressed image data and the previously determined output image data. For the associating, the artificial neural network weights, at 19, the previously determined output image data differently than the received part of the compressed image data. At 20, the circuitry 2 generates output image data. The generating of the output image data includes upsampling, at 21, the compressed image data. By the upsampling at 21, the circuitry generates image data that include a pre-output array. The generating of the output image data further includes deblurring, at 22, the image data, which have been obtained at 21 based on the compressed image data, based on the association of the portion of the object. The compressed image data, which are inputted to the artificial neural network 18, are associated with a predefined number of phases; and the artificial neural network 18 generates the output image data based on the inputted compressed image data, as indicated by an arrow in Fig.2.
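As an invented, non-limiting illustration of the upsampling at 21 and the deblurring at 22, the following Python/NumPy sketch upsamples one binned frame to the sensor resolution and applies a simple unsharp-mask sharpening as a crude stand-in for the deblurring (the actual deblurring at 22 is based on the association and the decompression kernel 26 and may differ substantially):

    import numpy as np

    def box_blur(img):
        # 2x2 box average (wrap-around borders), a crude model of the blur
        # introduced by 2x2 binning.
        return (img + np.roll(img, 1, 0) + np.roll(img, 1, 1)
                + np.roll(img, 1, (0, 1))) / 4.0

    binned = np.random.rand(4, 4) * 4                        # invented binned counts
    up = binned.repeat(2, axis=0).repeat(2, axis=1) / 4.0    # upsampling at 21
    deblurred = np.clip(up + (up - box_blur(up)), 0.0, None) # toy deblurring at 22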
The upsampling at 21 includes generating a super-resolution image, such that the output image data include a frame whose number of pixels exceeds a number of photosensitive elements that have acquired the photon count. The deblurring at 22 of the image data includes transforming a position of the portion of the object of each phase of the plurality of phases such that the position of the portion of the object matches across the plurality of phases. The deblurring at 22 further includes estimating, at 23, a representation of the portion of the object in the output image data at a predefined point in time based on acquisition times associated with the plurality of phases. The estimating at 23 includes determining, at 24, based on the association, pixel values of the image data that correspond to the portion of the object, wherein the pixel values are based on the photon counts; and applying, at 25, a decompression kernel 26 to the determined pixel values. The decompression kernel 26 generates pixel values of the output image data based on a change of the pixel values across the plurality of phases. Fig.3 illustrates embodiments of binning photosensitive elements. A first embodiment 30 is based on cathode sharing. Two SPADs 31 and 32 (which are examples of photosensitive elements) are each connected to a respective resistor 33 and 34. Cathodes of the SPADs 31 and 32 are connected by a switch 35 such that photon detection signals from any one of the SPADs 31 and 32 are received by a NOT gate 36, which is connected to the cathode of the SPAD 32. An output from the NOT gate 36 is provided to a counter 37. The counter 37 counts photon detection signals and outputs a signal that indicates a photon count to a scanning circuit. A second embodiment 40 is based on pulse summing. Two SPADs 41 and 42 (which are examples of photosensitive elements) are each connected to a respective resistor 43 and 44. Cathodes of both SPADs 41 and 42 are connected to a respective NOT gate 45 and 46. Outputs of the NOT gates 45 and 46 are provided to an OR gate 47 such that photon detection signals from any one of the SPADs 41 and 42 are merged into an output of the OR gate 47. The OR gate 47 provides its output to a counter 48. The counter 48 counts photon detection signals and outputs a signal that indicates a photon count to a scanning circuit. A third embodiment 50 is based on signal summing. Two SPADs 51 and 52 (which are examples of photosensitive elements) are each connected to a respective resistor 53 and 54. Cathodes of both SPADs 51 and 52 are connected to a respective NOT gate 55 and 56 and, from the NOT gates 55 and 56, to a respective counter 57 and 58. The counters 57 and 58 count photon detection
signals from the respective SPADs 51 and 52, and provide a respective photon count to an adder 59. The adder 59 adds the photon counts from the counters 57 and 58 and outputs a signal that indicates a summed photon count to a scanning circuit. The SPADs 31, 32, 41, 42, 51 and 52 are examples of SPADs (photosensitive elements) of the array 7a of SPADs of Fig.1. The counters 37, 48, 57 and 58 are examples of structures in the logic layer 7b of Fig.1. Any one of the embodiments 30, 40 and 50 may be applied to the image sensor 7 for performing shifted-pixel binning (SPB). The embodiments 30, 40 and 50 may be adapted for binning more than two SPADs (e.g., three SPADs, four SPADs or any other suitable number of SPADs / photosensitive elements), as the skilled person may appreciate. Fig.4 illustrates an embodiment of a pixel binning in a first phase (Phase #1) of shifted-pixel binning (SPB) in a color array 60. The color array 60 is an example of the array 7a of photosensitive elements of Fig.1. The color array 60 includes a plurality of SPADs (which are examples of photosensitive elements) that are configured to detect single photons of a predefined color (wavelength range), wherein each SPAD includes a color filter such that the SPAD detects light of its associated color. The SPADs of the color array 60 are arranged in a Bayer pattern, and the color array 60 is configured as a color-filter array of an RGGB type. SPADs labeled as “R” are configured to detect red light, SPADs labeled as “G” or “G’” are configured to detect green light, and SPADs labeled as “B” are configured to detect blue light. An index of the labels of a SPAD denotes a position of the SPAD in the array 60. In Phase #1 of SPB, the green SPADs G’11, G’12, G’21 and G’22 are binned, from which the image sensor 7 acquires a photon count 61. Further, the green SPADs G’13, G’14, G’23 and G’24 are binned, from which the image sensor 7 acquires a photon count 62. Also, the green SPADs G’31, G’32, G’41 and G’42 are binned, from which the image sensor 7 acquires a photon count 63. Further, the green SPADs G’33, G’34, G’43 and G’44 are binned, from which the image sensor 7 acquires a photon count 64. Fig.5 illustrates an embodiment of a pixel binning in a second phase (Phase #2) of SPB in the color array 60. In Phase #2 of SPB, the green SPADs G’12, G’13, G’22 and G’23 are binned, from which the image sensor 7 acquires a photon count 65. Also, the green SPADs G’32, G’33, G’42 and G’43 are binned, from which the image sensor 7 acquires a photon count 66. Further, the green SPADs G’14 and G’24 are binned with further SPADs (not shown in Fig.5), from which the image sensor 7 acquires a photon count 65a. Also, the green SPADs G’34 and G’44 are binned
with further SPADs (not shown in Fig.5), from which the image sensor 7 acquires a photon count 66a. Fig.6 illustrates an embodiment of a pixel binning in a third phase (Phase #3) of SPB in the color array 60. In Phase #3, the green SPADs G’22, G’23, G’32 and G’33 are binned, from which the image sensor 7 acquires a photon count 67. Fig.7 illustrates an embodiment of a pixel binning in a fourth phase (Phase #4) of SPB in the color array 60. In Phase #4 of SPB, the green SPADs G’21, G’22, G’31 and G’32 are binned, from which the image sensor 7 acquires a photon count 68. Also, the green SPADs G’23, G’24, G’33 and G’34 are binned, from which the image sensor 7 acquires a photon count 69. Fig.8 illustrates an embodiment of generating an output high-resolution color image 72 based on SPB. The circuitry 2 obtains compressed image data that are based on the photon counts 61 to 69 from the Phases #1 to #4, as shown in Fig.4 to 7, and merges the compressed image data into a pre-output array 70. In the pre-output array 70, a pixel G’11 is based on the photon count 61, a pixel G’12 is based on the photon count 65, a pixel G’13 is based on the photon count 62, a pixel G’14 is based on the photon count 65a, a pixel G’21 is based on the photon count 68, a pixel G’22 is based on the photon count 67, a pixel G’23 is based on the photon count 69, a pixel G’31 is based on the photon count 63, a pixel G’32 is based on the photon count 66, a pixel G’33 is based on the photon count 64, and a pixel G’34 is based on the photon count 66a. Thus, some pixel values in a G’-channel of the pre-output array 70 are based on an output O_G’11, O_G’13, O_G’31, and O_G’33 of Phase #1, wherein:

O_G’11 = G’11 + G’12 + G’21 + G’22
O_G’13 = G’13 + G’14 + G’23 + G’24
O_G’31 = G’31 + G’32 + G’41 + G’42
O_G’33 = G’33 + G’34 + G’43 + G’44

Similarly, some pixel values in the G’-channel of the pre-output array 70 are based on an output O_G’12, O_G’14, O_G’32, and O_G’34 of Phase #2, wherein:

O_G’12 = G’12 + G’13 + G’22 + G’23
O_G’14 = G’14 + G’15 + G’24 + G’25
O_G’32 = G’32 + G’33 + G’42 + G’43
O_G’34 = G’34 + G’35 + G’44 + G’45
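As an aside, the shifted binning that produces sums like those above can be sketched in Python as follows. The function name, the (dy, dx) shift convention per phase, and the toy Poisson counts are illustrative assumptions for this sketch only and are not part of the disclosure.

import numpy as np

def spb_phase_outputs(counts, k=2, dy=0, dx=0):
    """Sum photon counts in k x k bins whose grid is shifted by (dy, dx).

    counts: 2D array of per-SPAD photon counts of one color channel
    (e.g., the G'-channel of the color array 60). Returns a dict that
    maps the (row, col) of each bin's top-left SPAD to its summed count,
    mirroring, e.g., O_G'11 = G'11 + G'12 + G'21 + G'22. Partial bins at
    the array border (cf. the counts 65a and 66a) are skipped here.
    """
    h, w = counts.shape
    outputs = {}
    for r in range(dy, h - k + 1, k):
        for c in range(dx, w - k + 1, k):
            outputs[(r, c)] = int(counts[r:r + k, c:c + k].sum())
    return outputs

counts = np.random.poisson(3.0, size=(6, 6))    # toy photon counts
phase1 = spb_phase_outputs(counts, dy=0, dx=0)  # unshifted kernel (Fig.4)
phase2 = spb_phase_outputs(counts, dy=0, dx=1)  # shifted by one column (Fig.5)
phase3 = spb_phase_outputs(counts, dy=1, dx=1)  # shifted by one row and one column (Fig.6)
phase4 = spb_phase_outputs(counts, dy=1, dx=0)  # shifted by one row (Fig.7)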
This algorithm is followed accordingly for the further phases (Phases #3 and #4) and channels (R, G, B) for generating the pre-output array 70. Fig.4 to 8 show four phases of a 2x2 SPB kernel, wherein the circuitry 2 aligns all objects in an imaged scene across phases. It is noted that Fig.4 to 8 show binning of SPADs and generation of the pre-output array 70 only for some SPADs/pixels of the G’-channel for illustration purposes. For generating the compressed image data, the image sensor 7 performs the processing illustrated in Fig.4 to 8 for further SPADs of the G’-channel as well as for the R-, G- and B-channels accordingly. In the pre-output array 70, the phases are registered, wherein values based on photon counts from the phases are registered (arranged) in their correct locations. The pre-output array 70 has a high resolution (i.e., a resolution that corresponds to a number of SPADs of the array 7a). However, the pre-output array 70 is still blurred (in particular if the compressed image data represent a portion of an object that has been moving between the phases) and, thus, does not yet have the required quality. The circuitry 2 performs a decompression 71 of the pre-output array 70. The decompression 71 is an example of the upsampling 21 and the deblurring 22 of Fig.2. For the decompression 71, the circuitry 2 may use an analytical method (e.g., the decompression kernel 26) and/or the artificial neural network 18. With the decompression 71, the circuitry 2 generates an output high-resolution color image 72 with a recovered quality (which includes a reduced motion blur of moving objects). The output high-resolution color image 72 is an example of output image data. It is noted that, although the SPB cycle illustrated in Fig.4 to 8 includes four phases, the disclosure is not limited to four phases. For example, an SPB cycle may include two phases, three phases, six phases, eight phases, nine phases, or any other suitable number of phases. It is further noted that the assignment of the arrows to pixels in the pre-output array 70, as shown in Fig.8, is provided as an example. The assignment may change in accordance with the association of a portion of an object represented by the compressed image data that are based on the photon counts 61 to 69, e.g., if the portion of the object is determined to have changed between the phases of Fig.4 to 7. Fig.9 illustrates an embodiment of SPB in a grayscale array 80 and generating an output high-resolution grayscale image 83 based on SPB. The grayscale array 80 is an example of the array 7a of SPADs of Fig.1. The grayscale array 80 includes a plurality of SPADs, which are examples of photosensitive elements, and which are capable of detecting light of any color
(wavelength) within a predefined range (in the case of Fig.9, within the visible range). Thus, the grayscale array 80 is capable of detecting black, white, and a predefined number of grayscale/monochrome values according to a bit-length of a counter that counts photon detection events of the SPADs. The SPADs of the grayscale array 80 are labeled with a “P”, followed by an index that indicates a position of the SPAD in the grayscale array 80. In the four phases Phase #1 to Phase #4, the SPADs of the grayscale array 80 are binned according to a predefined binning pattern, which is indicated by bold lines. The image sensor 7 acquires a photon count from each bin and generates compressed image data based on the photon counts. The circuitry 2 obtains the compressed image data from the image sensor 7 and merges the compressed image data into a pre-output array 81, wherein pixel values of the pre-output array 81 are based on the photon counts from corresponding bins, as indicated by arrows in Fig.9. For example, some pixel values of the pre-output array 81 are based on an output O_P11, O_P13, O_P31, and O_P33 of Phase #1, wherein:

O_P11 = P11 + P12 + P21 + P22
O_P13 = P13 + P14 + P23 + P24
O_P31 = P31 + P32 + P41 + P42
O_P33 = P33 + P34 + P43 + P44

Similarly, some pixel values of the pre-output array 81 are based on an output O_P12, O_P14, O_P32, and O_P34 of Phase #2, wherein:

O_P12 = P12 + P13 + P22 + P23
O_P14 = P14 + P15 + P24 + P25
O_P32 = P32 + P33 + P42 + P43
O_P34 = P34 + P35 + P44 + P45

This algorithm is followed accordingly for the further phases (Phases #3 and #4) for generating the pre-output array 81. In Fig.9, the arrows that indicate an assignment of pixels from the phases #1 to #4 to the pre-output array 81 correspond to locations in the case that no parts of the scene moved between the phases. However, in the case there is movement between phases in the recorded scene, the arrows change their final location depending on a reference for the alignment (i.e., for the associating of a portion of an object for the phases). For example, if phase #4 is used as the reference for the alignment, the arrows for phase #4 in Fig.9 are as represented towards the array 81,
but the arrows for the other three phases may point in other directions (i.e., towards other pixels of the pre-output array 81) than those shown in Fig.9. Depending on object movements in the scene, the alignment calculated between the different phases and the reference phase (e.g., phase #4 in this explanation) may differ from the representation in Fig.9, so that all detected objects represented by the compressed image data appear in their proper locations in the pre-output array 81. Fig.9 shows four phases of a 2x2 SPB kernel, wherein the circuitry 2 aligns all objects in an imaged scene across phases. It is noted that Fig.9 shows binning of SPADs and generation of the pre-output array 81 only for some SPADs/pixels for illustration purposes. For generating the compressed image data, the image sensor 7 performs the processing illustrated in Fig.9 for further SPADs of the array 7a accordingly. In the pre-output array 81, the phases are registered, wherein values based on photon counts from the phases are registered (arranged) in their correct locations, so each object of the scene is registered in the correct position of the array 81. The pre-output array 81 has a high resolution (i.e., a resolution that corresponds to a number of SPADs of the array 7a). However, the pre-output array 81 is still blurred due to the 2x2 SPB kernel. In some cases, the pre-output array 81 includes additional blur due to in-phase motion blur of the phases, e.g., in a case where a long exposure time per phase is used with respect to a speed of one or more objects in the scene. The circuitry 2 performs a decompression 82 of the pre-output array 81. The decompression 82 is an example of the upsampling 21 and the deblurring 22 of Fig.2. For the decompression 82, the circuitry 2 may use an analytical method (e.g., the decompression kernel 26) and/or the artificial neural network 18. With the decompression 82, the circuitry 2 generates an output high-resolution grayscale image 83 with a recovered quality (which includes a reduced motion blur of moving objects). The output high-resolution grayscale image 83 is an example of output image data. It is noted that, although the SPB cycle illustrated in Fig.9 includes four phases, the disclosure is not limited to four phases. For example, an SPB cycle may include two phases, three phases, six phases, eight phases, nine phases, or any other suitable number of phases. It is further noted that an alignment and summing, as shown in Fig.8 and 9, may be performed in a host (e.g., a Qualcomm host processor) and/or may be performed under the image sensor (in a sensor logic, e.g., compute-in-memory (CIM)) or in a coprocessor accompanying the image sensor (e.g., a powerful image signal processor (ISP)).
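The registration described above can be illustrated by the following Python sketch, which merges the binned outputs of all phases into a pre-output array for the static-scene case. The function name, the dictionary layout, and the choice of a bin's top-left SPAD as the pivot position are assumptions of this sketch, not details mandated by the disclosure.

import numpy as np

def merge_phases_static(phase_outputs, shape):
    """Register phase outputs into a high-resolution pre-output array.

    phase_outputs: per phase, a dict mapping the (row, col) of a bin's
    pivot SPAD to the summed photon count of that bin (cf. O_P11 etc.).
    Static-scene case: each value lands at its pivot position unchanged.
    """
    pre_output = np.zeros(shape)
    for outputs in phase_outputs:
        for (r, c), value in outputs.items():
            # For a moving scene, (r, c) would first be corrected by the
            # displacement estimated between this phase and the reference
            # phase (e.g., Phase #4), as discussed for Fig.9.
            pre_output[r, c] = value
    return pre_output

The pre-output array produced this way still carries the 2x2 SPB blur; the decompression 82 (analytical or NN-based) is applied afterwards.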
Fig.10 illustrates an embodiment of generating output image data from compressed image data. In the embodiment of Fig.10, the circuitry 2 generates output image data based on quanta burst imaging (QBI) with four phases per SPB cycle, wherein the circuitry generates one output frame for every completed SPB cycle. Frames 90 to 94 show a scene at different points in time, as it may be captured at a native resolution of the sensor array 7a (e.g., 12 Mega-SPADs, without limiting the disclosure to this value). The image sensor 7 acquires, at 13 of Fig.2, a plurality of image frames 95 to 99 of the scene in subsequent SPB phases at a SPB resolution (e.g., with 3 Mega-pixels for 2x2 binning, without limiting the disclosure to these values). The image frames 95 to 99 have the SPB resolution, which is reduced with respect to the native sensor resolution, and are an example of compressed image data. The circuitry 2 aligns objects of the scene from the different phases to one phase. In the case of Fig.10, every four subsequent image frames (here, the image frames 95 to 98) are merged into one output frame. Therefore, objects in the image frames 95 to 98 are aligned, as illustrated in the aligned frames 100 to 103. The image frame 99 is associated with Phase #1 of a next SPB cycle. Therefore, objects in the image frame 99 are aligned according to the next SPB cycle, as illustrated in the aligned frame 104. The circuitry 2 merges (which is an example of the upsampling at 21 of Fig.2) the aligned frames 100 to 103 into a pre-output image 105 (which is an example of the pre-output array 70 of Fig.8 and of the pre-output array 81 of Fig.9). The pre-output image 105 is still blurred (similar to being out-of-focus), as illustrated by dotted contours of the pre-output image 105. Since the aligned frame 104 is associated with another SPB cycle, the aligned frame 104 is merged with aligned frames of the other SPB cycle. The circuitry 2 then generates an output image 106 (which is an example of output image data). The generating of the output image 106 is an example of the generating of output image data at 20 of Fig.2 and of the deblurring at 22 of Fig.2. The output image 106 represents a deblurred image of the scene at a point in time and with native sensor resolution (e.g., 12 Mega-pixels, without limiting the disclosure thereto). Fig.11 illustrates an embodiment of generating a plurality of aligned frames from a shifted-pixel binning cycle. In the embodiment of Fig.11, the image sensor 7 acquires compressed image frames 95 to 99 based on SPB, as described above with respect to Fig.10. However, in the
embodiment of Fig.11, the circuitry 2 generates four aligned frames from each of the image frames 95 to 98, wherein in each of the four aligned frames, the objects are aligned to their respective positions in one of the four phases. Thus, in an aligned frame 110, the objects from the image frame 95 are aligned to their positions in the image frame 95, which has been acquired in Phase #1. In an aligned frame 111, the objects from the image frame 95 are aligned to their positions in the image frame 96, which has been acquired in Phase #2. In an aligned frame 112, the objects from the image frame 95 are aligned to their positions in the image frame 97, which has been acquired in Phase #3. In an aligned frame 113, the objects from the image frame 95 are aligned to their positions in the image frame 98, which has been acquired in Phase #4. Likewise, in aligned frames 114 to 117, the objects from the image frame 96 are aligned to their positions in the image frames 95 to 98, respectively. In aligned frames 118 to 121, the objects from the image frame 97 are aligned to their positions in the image frames 95 to 98, respectively. In aligned frames 122 to 125, the objects from the image frame 98 are aligned to their positions in the image frames 95 to 98, respectively. Thus, in the embodiment of Fig.11, the objects from the scene in the image frames 95 to 98 of the different phases are aligned to all phases of the SPB cycle. Since the SPB cycle in Fig.11 includes four phases, the circuitry 2 generates four aligned image frames from each of the image frames 95 to 98, i.e., the sixteen aligned image frames 110 to 125 in total. Fig.12 illustrates an embodiment of generating a plurality of output images from a shifted-pixel binning cycle. The circuitry 2 merges the aligned image frames 110 to 125 into four pre-output images 126 to 129 according to their associated phases. The merging of the aligned image frames 110 to 125 is an example of the upsampling at 21 of Fig.2, and the pre-output images 126 to 129 are examples of the pre-output array 70 of Fig.8 and of the pre-output array 81 of Fig.9. More concretely, the pre-output image 126 corresponds to Phase #1 and is based on merging the aligned image frames 110, 114, 118 and 122. The pre-output image 127 corresponds to Phase #2 and is based on merging the aligned image frames 111, 115, 119 and 123. The pre-output image 128 corresponds to Phase #3 and is based on merging the aligned image frames 112, 116, 120 and 124. The pre-output image 129 corresponds to Phase #4 and is based on merging the aligned image frames 113, 117, 121 and 125.
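The grouping of the sixteen aligned frames into the four pre-output images may be expressed schematically as follows; this Python sketch assumes that each aligned frame has already been registered on the native-resolution grid, and the names aligned and pre_output_per_phase are illustrative only.

def pre_output_per_phase(aligned, n_phases=4):
    """Merge aligned frames into one pre-output image per phase (Fig.12).

    aligned[i][j]: the frame of phase i with its objects aligned to their
    positions in phase j (e.g., aligned[0][1] plays the role of the
    aligned frame 111). Summation stands in for the merging at 21.
    """
    return [
        sum(aligned[i][j] for i in range(n_phases))
        for j in range(n_phases)
    ]

For j = 0, the inner sum corresponds to merging the aligned image frames 110, 114, 118 and 122 into the pre-output image 126.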
The pre-output images 126 to 129 are still blurred (similar to being out-of-focus), as illustrated by dotted contours of the pre-output images 126 to 129. The circuitry 2 then generates output images 130 to 133 (which are examples of output image data). The generating of the output images 130 to 133 is an example of the generating of output image data at 20 of Fig.2 and of the deblurring at 22 of Fig.2. The output images 130 to 133 represent a deblurred image of the scene at a point in time and with native sensor resolution (e.g., 12 Mega-pixels, without limiting the disclosure thereto). More concretely, the output image 130 corresponds to Phase #1 and is based on deblurring the pre-output image 126. The output image 131 corresponds to Phase #2 and is based on deblurring the pre-output image 127. The output image 132 corresponds to Phase #3 and is based on deblurring the pre-output image 128. The output image 133 corresponds to Phase #4 and is based on deblurring the pre-output image 129. The embodiment of Fig.11 and 12 may, in some cases, introduce a delay in an output due to an amount of processing, but may generate an output movie with more fps (i.e., a higher frame rate). For example, in the embodiment of Fig.11 and 12, four output frames (output images 130 to 133) are generated in one SPB cycle, whereas only one output frame (output image 106) is generated in one SPB cycle in the embodiment of Fig.10. Thus, the embodiment of Fig.11 and 12 may be advantageous for slow-motion recordings, although it may not generate the output image data in real time in some cases. Fig.13 illustrates an embodiment of a neural network (NN) pyramidal workflow of SPB QBI based on step window processing. In the embodiment of Fig.13, input frames are recorded using a QBI array (e.g., the array 7a of Fig.1) with on-sensor SPB compression capabilities. In a scene that is imaged, objects may be in movement. In the case of using SPB-compressed data, the number of inputs is a multiple of the number of phases per SPB cycle, e.g., for a 2x2 SPB compression, the number of inputs is a multiple of 4. However, QBI non-compressed data are processed with this workflow in some embodiments. The image sensor 7 acquires a sequence of image frames, which are an example of compressed image data, based on SPB QBI. The circuitry 2 obtains the image frames and inputs them into an (artificial) NN 140, which is an example of the artificial neural network 18 of Fig.2. The NN 140 align-merges and enhances the input frames to generate an output frame (which is an example of output image data). A goal is that quality parameters (e.g., SNR, sharpness, etc.)
of the output frames generated by the NN 140 are similar to the ones of a one-shot picture where objects in the imaged scene are static and a predefined (e.g., sufficient) exposure is used. As illustrated in Fig.13, the circuitry 2 inputs image frames #1 to #8 to the NN 140, and the NN 140 generates an output frame #1 based on the input frames #1 to #8. Likewise, the circuitry 2 inputs image frames #9 to #16 to the NN 140, and the NN 140 generates an output frame #2 based on the input frames #9 to #16. The representation of the objects of the scene in the output frames from the NN 140 can be aligned (based on a design of the NN 140) with any of the input frames (e.g., for the output frame #1, at a position of the objects of the scene in the input frame #1, in the input frame #8, in a median frame #4 or #5, or the like), or at an interpolated position, e.g., at an average time of an acquisition time of the input frames #1 to #8. It is noted that the NN 140 receives more or fewer than eight input frames for generating the output frames in some embodiments. In some embodiments, the number of input frames of the NN 140 even differs from the number of phases per SPB cycle. Fig.14 illustrates an embodiment of a NN pyramidal workflow of SPB QBI based on sliding window processing. The sliding window processing of Fig.14 basically corresponds to the step window processing of Fig.13. However, a difference is that, in the sliding window processing of Fig.14, in each window only one new frame of the burst is inputted to the NN 141, as indicated by a circle around the input frame #9 in the case of the output frame #2, by a circle around the input frame #10 in the case of the output frame #3, and by a circle around the input frame #11 in the case of the output frame #4. Thus, the NN 141 generates the output frames based on seven input frames that have already been used for a respective previous output frame, and the one new input frame. The sliding window processing may iterate over the decompressed-enhanced image frames, or may iterate over the compressed (and non-enhanced) image frames while the decompressed-enhanced image frames may be provided to a user. It is noted that, in some embodiments, two, three or more new input frames are inputted to the NN 141 in each window. It is further noted that a sliding window size is not limited to eight input image frames, but includes any suitable number of input image frames, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more. Fig.15 illustrates an embodiment of a SPB QBI data processing based on a recurrent NN sequential workflow. The processing of Fig.15 is an example of the generating of output image
data at 20 of Fig.2. Fig.15 shows an embodiment that is based on step window processing with four phases per SPB cycle. Switches in Fig.15 (labeled as “Yes/No”) are selected as “Yes” when the input frames are SPB-compressed, otherwise as “No”. When a switch of a block is selected as “Yes”, the processing of the block is performed; when the switch is selected as “No”, the processing of the block is not performed and an input of the block is outputted unchanged. The processing of Fig.15 is based on image frames (input frames) #4n, #4n+1, #4n+2 and #4n+3, which are examples of compressed image data. For example, when n=0, then 4n=0, 4n+1=1, 4n+2=2, and 4n+3=3. When n=1, then 4n=4, 4n+1=5, 4n+2=6, and 4n+3=7. The image frame #4n+3 corresponds to Phase #4 and is inputted to an upsampling 151. The upsampling 151 is performed if the input frames are SPB-compressed, and provides its output as a reference to an alignment NN 152. The image frame #4n corresponds to Phase #1 and is inputted to an upsampling 153. The upsampling 153 is performed if the input frames are SPB-compressed, and provides its output to an alignment NN 154. The alignment NN 154 further receives the image frame #4n+3 as a reference, and provides its output to a mask 155. The mask 155 is performed if the input frames are SPB-compressed, and provides its output to a sum 156. The image frame #4n+1 corresponds to Phase #2 and is inputted to an upsampling 157. The upsampling 157 is performed if the input frames are SPB-compressed, and provides its output to an alignment NN 158. The alignment NN 158 further receives the image frame #4n+3 as a reference, and provides its output to a mask 159. The mask 159 is performed if the input frames are SPB-compressed, and provides its output to the sum 156. The image frame #4n+2 corresponds to Phase #3 and is inputted to an upsampling 160. The upsampling 160 is performed if the input frames are SPB-compressed, and provides its output to an alignment NN 161. The alignment NN 161 further receives the image frame #4n+3 as a reference, and provides its output to a mask 162. The mask 162 is performed if the input frames are SPB-compressed, and provides its output to the sum 156. The image frame #4n+3 is further inputted to an upsampling 163. The upsampling 163 is performed if the input frames are SPB-compressed, and provides its output to a mask 164. The mask 164 is performed if the input frames are SPB-compressed, and provides its output to the sum 156.
The sum 156 adds its inputs and provides its result to a sum 165. The sum 165 further receives a previous output frame aligned to the input frame #4n+3 from Phase #4, and adds its inputs. The sum 165 provides its result to a decompression & enhancement NN 166. The decompression & enhancement NN 166 is performed if the input frames are SPB-compressed, and decompresses and enhances its input. The decompression performed by the decompression & enhancement NN 166 includes generating a (deblurred) image by decompressing the SPB-compression of the input frames. Examples of an enhancement performed by the decompression & enhancement NN 166 include alignment corrections, denoising, SPB-decompression, etc. The decompression & enhancement NN 166 generates an enhanced frame. A feedback loop provides the enhanced frame as a previous output frame to the alignment NN 152 for a subsequent iteration of the processing of Fig.15 (e.g., for a subsequent SPB cycle). The alignment NN 152 generates, as its output, the previous output frame aligned to the input frame #4n+3 from Phase #4, and provides its output to the sum 165. The decompression & enhancement NN 166 further provides the enhanced frame to a decompression & enhancement NN 167. The decompression & enhancement NN 167 generates an output frame (which is an example of output image data) by performing further decompression & enhancement processing based on the enhanced frame. The decompression & enhancement NN 167 enhances a visualization of the enhanced frame outputted by the decompression & enhancement NN 166 such that the output frame (with an enhanced visualization) can be provided to a user without the enhanced visualization affecting the feedback loop. For example, in a case where the enhanced visualization introduces an undesirable effect, aligned frames may be accumulated in the feedback loop without performing the enhancing and/or decompressing that causes the undesirable effect in the decompression & enhancement NN 167, such that a signal-to-noise ratio (SNR) may be increased in each iteration in the background, while the enhanced visualization may be applied in the decompression & enhancement NN 167. It is noted that, in some embodiments, the decompression & enhancement NN 167 is omitted (e.g., if the decompression and/or enhancement does not cause an undesirable effect in the feedback loop), and the enhanced frame outputted by the decompression & enhancement NN 166 corresponds to the output frame.
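One iteration of the workflow of Fig.15 may be sketched in Python as follows. The block names (upsample, align, mask, decompress_enhance, visual_enhance) are illustrative stand-ins for the blocks 151 to 167, and the sketch assumes SPB-compressed inputs (all switches set to “Yes”); it is not a definitive implementation.

def spb_recurrent_step(frames, prev_output, blocks):
    """One SPB cycle of the Fig.15 workflow (step window, four phases).

    frames: [#4n, #4n+1, #4n+2, #4n+3] for Phases #1 to #4.
    prev_output: enhanced frame fed back from the previous cycle.
    Returns the enhanced frame (for the feedback loop) and the output
    frame (for the user).
    """
    reference = blocks.upsample(frames[3])                # upsampling 151
    acc = 0
    for frame in frames[:3]:                              # Phases #1 to #3
        up = blocks.upsample(frame)                       # upsamplings 153/157/160
        aligned = blocks.align(up, reference)             # alignment NNs 154/158/161
        acc = acc + blocks.mask(aligned)                  # masks 155/159/162, sum 156
    acc = acc + blocks.mask(blocks.upsample(frames[3]))   # upsampling 163, mask 164
    acc = acc + blocks.align(prev_output, reference)      # alignment NN 152, sum 165
    enhanced = blocks.decompress_enhance(acc)             # NN 166, fed back next cycle
    output = blocks.visual_enhance(enhanced)              # optional NN 167
    return enhanced, output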
As a variation of the embodiment of Fig.15, the processing may be based on sliding window processing instead of step window processing. In the sliding window processing, instead of updating all the phases after a new updated output frame, only the oldest phase is renewed. Further, one output frame per input phase may be generated (for any one of the step window processing and the sliding window processing) by any one of the following options A and B. Option A is based on calculating different alignments to each of the phases, and applying enhancements to all the output frames (e.g., four output frames instead of one). Option B is based on obtaining four motion-flow maps from the alignment NN, and calculating the remaining motion-flow maps by simple subtraction of the four motion-flow maps. Enhancements may be performed for all of the motion-flow maps. Fig.16 illustrates an embodiment of a NN pyramidal workflow for non-compressed QIS data. For the embodiments of Fig.13 and 14, a number of inputs larger than one has been described to show an architecture that can operate with SPB-compressed QIS data and with non-compressed QIS data. However, for the specific case of non-compressed QIS data, the architecture of Fig.13 and 14 can be implemented with a minimum of two inputs. The number of inputs depends on an optimization that should be obtained with the respective NN, e.g., a number and quality of outputs. In the embodiment of Fig.16, a step-window processing with two inputs and one output of a NN 170 is performed. Apart from that, the embodiment of Fig.16 corresponds to the embodiments of Fig.13 and 14. Fig.17 illustrates an embodiment of a recurrent NN workflow for non-compressed QIS data. For the embodiment of Fig.15, a number of inputs larger than one has been described to show an architecture that can operate with SPB-compressed QIS data and with non-compressed QIS data. However, for the specific case of non-compressed QIS data, the architecture of Fig.15 can be implemented with a minimum of two inputs. The number of inputs depends on an optimization that should be obtained with the respective NN, e.g., a number and quality of outputs. In the embodiment of Fig.17, a step-window processing with two inputs and one output is performed. As a first input, an input frame n (an image frame of non-compressed image data) is provided to an alignment NN 171 as a reference. As a second input, a previous output frame (an output frame from a previous iteration of the step-window processing of Fig.17) is provided to the alignment NN 171. The alignment NN 171 generates, as its output, the previous output frame aligned to the input frame n, and provides its output to a sum 172. The sum 172 receives the input frame n as a further input, adds its inputs, and provides its result to an enhancement
NN 173. Examples of enhancements applied by the enhancement NN 173 include alignment corrections, denoising, etc. The enhancement NN 173 generates an output frame (which is an example of output image data). In a feedback loop, the output frame is provided as a previous output frame to the alignment NN 171 in a subsequent iteration. Apart from that, the embodiment of Fig.17 corresponds to the embodiment of Fig.15. Fig.18 illustrates embodiments of pipelines. Different pipelines may be used to enhance QBI images. Each pipeline has different objectives and can support one mode or multiple modes, e.g., low-resolution mode (LR), high-resolution mode (HR), and compressed-resolution mode (CR) when using shifted-pixel binning (SPB) compression. A first pipeline 180 is for the CR mode. Both color (Red-Green-Blue (RGB)) and monochrome/grayscale data are supported via loading and preprocessing 181. The pipeline 180 can have two or three blocks. In a first block 182, multiple input images are first aligned 183 and an optical flow is obtained for each input. Then each input is warped 184 to a latest frame according to the optical flow (the input may also be warped to any other frame, e.g., to a first frame). All the warped images are summed 185 based on a mask that is related to the CR pattern (shifted-pixel binning around a pivoting pixel). An output at this step is a low-resolution image (e.g., in a case of 2x2 SPB, a resolution may be reduced by 2x2) with the negative effect of appearing as if the frames had been acquired out-of-focus. In a second block 186, a decompressor is used to recover an in-focus HR image. A third block 187 is optional, where an enhancement NN (#1) is used to further improve an image quality (e.g., denoising, sharpening, color adjustment, etc.). It is noted that an output 188 of the pipeline 180 may be used as a feedback image in order to continue improving the output 188. A second pipeline 190 is for the LR/HR/CR modes. Both color/RGB and monochrome/grayscale data are supported via loading and preprocessing 191. The pipeline 190 has two blocks. In a first block 192, multiple input images are first aligned 193 and an optical flow is obtained for each input. Then each input is warped 194 to a latest frame according to the optical flow (the input may also be warped to any other frame, e.g., to a first frame). All the warped images are summed 195 together. Especially for CR, a mask that is related to the CR pattern (shifted-pixel binning around a pivoting pixel) is used for the sum.
In a second block 197, an enhancement NN (#2) is used to decompress the images (in case of CR) and to improve an image quality (e.g., denoising, sharpening, color adjustment, etc.). It is noted that an output 198 of the pipeline 190 may be used as a feedback image in order to continue improving the output 198. A third pipeline 200 is for the LR/HR/CR modes, but can support demosaic. After loading and preprocessing data 201, mosaic figures may be obtained, e.g., in a pattern of RGGB. The pipeline 200 has three blocks. In a first block 202, multiple input images are first aligned 203 and an optical flow is obtained for each input. Then each input is warped 204 to a latest frame according to the optical flow (the input may also be warped to any other frame, e.g., to a first frame). All the warped images are summed together. Especially for CR, a mask that is related to the CR pattern (shifted-pixel binning around a pivoting pixel) is used for the sum. A second block 206 is only for CR mode, where a decompressor is used to recover HR. In a third block 207, a denoiser is used to improve an image quality. An enhancement NN (#3) receives mosaic inputs and generates RGB output, i.e., it supports demosaic. It is noted that an output 208 of the pipeline 200 may be used as a feedback image in order to continue improving the output 208. For each pipeline and each mode, different datasets may be used to train different neural networks. Regarding the data loading and preprocessing 181, 191 and 201, respectively, the following is noted. Specific datasets may be generated to train different neural networks in different pipelines, e.g., alignment net, decompressor, and enhancement. The key point may be to make sure that a training dataset and realistic testing data have similar characteristics, e.g., a similar distribution. A ground truth of the datasets may contain images that also show moving objects, in order to include those characteristics in the training dataset. Overall, a Quantum sensor may work in very low light considering its zero readout noise, i.e., it may have Poisson noise. As a result, for all the modes, datasets are generated following a Poisson distribution. A wide range of Poisson variances is used to train for multiple light conditions, since the lower the light, the higher the noise level may be. Different models are trained with different datasets for different pipelines and modes. In detail:
For a decompressor (e.g., the decompressor 186 or 206) and an enhancement of CR mode (e.g., the enhancement NN 187, 197 or 207), first Poisson noise is added to ground-truth images, then color is converted to monochrome if monochrome mode is used. Then a data sequence is compressed using SPB, and the compressed data sequence is passed through the align-warp-mask_sum pipeline 182, 192 or 202, respectively, and a dataset is obtained that has the artifacts (i.e., lower resolution and out-of-focus result) of CR. For an enhancement of HR mode (e.g., the enhancement NN 197 or 207), Poisson noise is first added to ground-truth images, then color is converted to monochrome if monochrome mode is used. Then a data sequence is passed through the align-warp-sum pipeline 192 or 202, respectively, and a dataset of HR artifacts is obtained. For an enhancement of LR mode (e.g., the enhancement NN 197 or 207), Poisson noise is first added to ground-truth images, then color is converted to monochrome if monochrome mode is used. Then a data sequence is subsampled, and the processed sequence is passed through the align-warp-sum pipeline 192 or 202, respectively, and a dataset of LR artifacts is obtained. Regarding the align-warp-sum block 182, 192 and 202, respectively, the following is noted. An alignment net is used to obtain a pixel-wise movement between two adjacent frames. Given two images I1 and I2, the alignment net estimates a motion of a pixel point from I1 to I2. The output of the net is called optical flow: f = F(I1, I2), where f denotes the optical flow and F is the alignment net. The output flow contains a movement vector of each pixel in horizontal and vertical directions. To train the alignment net, noisy I1 and I2 are first created, and the optical flow f is used as the ground-truth.
It is noted that a range of noise levels may be wide to suit complex realistic scenes. Based on the optical flow, the objective frame is mapped to a template frame. The mapping is realized via warp, where interpolation is used to make the mapping finer. The warped objective image is summed together with the template image to enhance an image quality. Regarding the decompressor block 186 and 206, respectively, the following is noted. The decompressor is used to recover a HR image from a CR sequence. In CR mode, the CR sequence is compressed from an HR sequence. For example, a CR sequence of 4 frames (n=4) may be obtained from HR: (C1, C2, …, Cn) = Compress(H1, H2, …, Hn),
where Compress denotes a procedure of spatial compressing. A CR image Ci may have a higher SNR but a lower modulation transfer function (MTF) compared to an HR image Hi. A decompressor is used to recover the spatial information. Specifically, summed CR images may be inputted and a summed HR image may be obtained: H_sum = F(C_sum), where F is the decompressor net. Alternatively, the individual CR images may be inputted and a summed HR image may be obtained: H_sum = F(C1, C2, …, Cn). The decompressor includes a U-net-like network. A specific
loss function is designed to improve a contrast and maintain an SNR. To make sure that an estimated image is close to the ground-truth, a mean squared error (MSE) is used as a data fidelity term in the loss. To obtain a high contrast, a structural similarity index measure (SSIM) term is introduced. Furthermore, outputs of the decompressor are passed to the enhancement NN 187 or 207, respectively, to further enhance an image quality. As a result, the decompressor output should have similar characteristics to the inputs of the alignment. To realize this, also a similarity loss is introduced, which is an L1-norm. It is noted that an L1-norm is chosen here to make sure that the output and the ground-truth have a similar distribution. In conclusion, the loss function is:

ℒ = SSIM(x, GT) + λ1 · MSE(x, GT) + λ2 · ‖x − GT‖1,

where x is the net output, GT denotes the ground-truth (summed HR image), and ‖·‖1 is the L1-norm for the similarity loss. Regarding the enhancement NN block 187, 197 and 207, respectively, the following is noted. After the alignment or the decompressor, the image may still have a low SNR, especially in low light conditions and with not enough inputs. To improve the SNR, the enhancement NN 187, 197 and 207 is introduced in the pipelines. The enhancement NN 187, 197 and 207 is trained by inputting low SNR inputs and a high SNR ground-truth. As mentioned, HR or CR artifacts may be added to the training dataset to improve the training performance.
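The decompressor loss above may be prototyped as in the following PyTorch sketch. It assumes that a differentiable SSIM implementation is supplied by the caller, and it assumes the common convention that (1 − SSIM) is minimized; the weights lam1 and lam2 and this convention are assumptions, since the text only names the three terms.

import torch

def decompressor_loss(x, gt, ssim_fn, lam1=1.0, lam2=1.0):
    """Sketch of ℒ = (1 − SSIM(x, GT)) + λ1·MSE(x, GT) + λ2·‖x − GT‖1.

    ssim_fn: any differentiable SSIM returning a scalar similarity in [0, 1].
    """
    ssim_term = 1.0 - ssim_fn(x, gt)           # contrast term
    mse_term = torch.mean((x - gt) ** 2)       # data fidelity term
    l1_term = torch.mean(torch.abs(x - gt))    # similarity (distribution) term
    return ssim_term + lam1 * mse_term + lam2 * l1_term

The enhancement loss given in the following paragraph (PSNR, SSIM, edge and L2 terms) can be assembled in the same way.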
The enhancement NN 187, 197 and 207 includes a U-net-like network. A specific loss function is designed to improve both a high SNR and a contrast. In detail, a peak signal-to-noise ratio (PSNR) is introduced in the loss function to make sure that high SNR outputs are obtained. To get a high contrast, a combination of SSIM and Sobel-based edge loss functions is introduced. Furthermore, the outputs of the enhancement NN 187, 197 and 207 are fed back to a next level alignment in order to continue enhancing an image quality. As a result, an output of the enhancement NN 187, 197 and 207 may have similar characteristics to the inputs of the alignment. To realize this, also a similarity loss is introduced, which is an L2-norm. It is noted that an L2-norm is chosen here to maintain the SNR. In conclusion, the loss function is:

ℒ = PSNR(x, GT) + λ1 · SSIM(x, GT) + λ2 · Edge(x, GT) + λ3 · ‖x − GT‖2,

where x is the net output, GT denotes the ground-truth, and ‖·‖2 is the L2-norm for the similarity loss. Consequently, in some embodiments, instead of having a bit-counter (e.g., 8 bits/counter) per pixel (photosensitive element; e.g., SPAD) in an image sensor, photosensitive elements that belong to a same color channel (pixel binning in each color-filter channel) are grouped/binned (e.g., 2x2 for illustration purposes, without limiting the disclosure thereto) to share a bit-counter per group of pixels (photosensitive elements) (e.g., an RGGB color filter may require four separate groups of pixels and, thus, four independent bit-counters, one per color-filter). In a case of a grey-sensor array, only one bit-counter per pixel-binning may be provided. A compression kernel size in a pixel binning may be selected depending on a compression required, e.g., 2x2, 3x3, 4x4, 5x5, etc., without limiting the disclosure to these values. Based on each compression ratio, different image frames (phases) may be acquired, wherein a location of the compression kernel may be shifted in each phase, e.g., using a pivoting pixel as a reference. For example, in a case of using a 2x2 pixel binning, 2x2=4 different images (phases) may be acquired, wherein the compression kernel position may be shifted in each phase; in a case of using 3x3 pixel binning, 3x3=9 different images (phases) may be acquired, wherein the compression kernel position may be shifted in each phase. Thus, a shifted-pixel binning (SPB) technique is applied in some embodiments. A size of the image phases may be smaller than an original image (native sensor resolution), e.g., using a 2x2 kernel, each image-phase may be four times smaller than the original image (native sensor resolution). It is noted that the disclosure is not limited to these values.
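The per-color-channel grouping may be illustrated by the following Python sketch, which assumes an RGGB mosaic of per-SPAD photon counts; the slicing conventions, and in particular which green plane is labeled G versus G', are assumptions of this sketch (the actual counter sharing is a hardware feature, as in Fig.3).

import numpy as np

def bayer_channels(counts):
    """Split an RGGB mosaic of photon counts into its four color planes.

    Each plane would be binned by its own shifted 2x2 kernel, i.e., each
    2x2 group within a plane shares one bit-counter.
    """
    return {
        "R": counts[0::2, 0::2],
        "G": counts[0::2, 1::2],
        "G'": counts[1::2, 0::2],   # assignment of G vs G' is illustrative
        "B": counts[1::2, 1::2],
    }

Each returned plane can then be processed per phase, e.g., with the shifted binning sketched earlier for Fig.4 to 7.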
In each group of phases (e.g., 2x2 phases, without limiting the disclosure to this value), as many frames (groups of phases) may be recorded in the burst as are necessary to achieve a final output quality (e.g., SNR, sharpness, etc.). Once the phases are acquired, a location of a reference pixel (or of reference pixel values) of each phase SPB-group (compressed resolution) may be registered (situated) in a corresponding location in a pre-output array (which may have an original resolution, e.g., native sensor resolution). In a case of a static scene, where there are no object/scene movements, the locations of the pixel values of each phase may correspond to locations of the shifted compression kernels (SPB). In a case of a moving scene, where there are object/scene movements, a displacement (motion estimation) of the objects in the scene between phases may be obtained and then situated/located (registered) in the pre-output array, also considering a compression kernel position in each phase. In some embodiments, each pixel-value of each phase only contributes to one single pixel of the pre-output array. At this point (after the acquisition of phases, motion estimation, and registration), the pre-output array image (which may have an original resolution, i.e., non-compressed image size) may differ from an original (non-compressed) image, e.g., it may seem blurred/out-of-focus due to the SPB compression. However, since the applied kernel compression may be known, the SPB-compressed image may be de-compressed. The de-compression may be based on applying analytical solutions, e.g., based on a convolution on the pre-output image using an inverted compression kernel (decompression kernel), which may allow recovering the original image. The de-compression may also be based on applying neural-network based solutions, e.g., making use of the known SPB kernel, a dataset may be prepared for training a NN, which may be capable of recovering the de-compressed image without blur/out-of-focus. Beyond implementing, in a hardware of the sensor 7, a possibility of different SPB-compression kernel sizes (e.g., 2x2, 3x3, 4x4, etc., without limiting the disclosure to these values), also simultaneous SPB-compression for different kernel sizes may be implemented, e.g., each sensor readout may provide several images with different SPB-compression ratios (e.g., 3x3, 5x5, 7x7, etc., without limiting the disclosure to these values) simultaneously. This hardware feature may provide an output that may correspond with an image pyramid, which may be used to speed up the
NN processing, e.g., simplifying a number of convolutions needed for object alignment between frames. Due to large compression ratios, simultaneous SPB-compression for different kernel sizes may have a minor impact on a data rate (from the image sensor 7 to the host/circuitry 2). In some embodiments, an output of a QBI NN that uses a compressed resolution of 2x2 SPB achieves a same image quality as when using a non-compressed resolution. Accordingly, some embodiments provide an on-sensor lossless hardware compression technique that may allow reducing a data-rate from an image sensor to a host, such that a larger resolution array may be possible with a same sensor-host communication technology channel (e.g., MIPI). The technique may further allow reducing a number of bit-counters necessary in an array of photosensitive elements of the image sensor, thus avoiding the logic layer being a limiting factor for a final stack sensor. A pixel pitch of the sensor array may be reduced, and the necessary number of bit-counters may still fit under the array of photosensitive elements. A decompression according to the technique may be simply implemented in an existing NN without a significant computational cost impact. QBI and QBI with SPB-compressed data may be applied based on two main approaches (i.e., based on an artificial NN or based on an analytical solution). In some instances, a SPB-compression hardware implementation may be very complex in a non-SPAD-based pixel array because of a complexity of designing/fabricating the required hardware multiple-connections while keeping a signal integrity. Using a SPAD-based pixel array may thus facilitate the SPB-compression hardware implementation. The SPB-compression hardware implementation may be simpler than compressive sensing techniques, which may require more complex interconnections, wherein these interconnections may even need to be adapted to each scene property. SPB-compression is especially suitable for QBI in some embodiments. In QBI, several images may be acquired to obtain a good quality image output. SPB-compression may use phases (images) to recover an original resolution. SPB-decompression may be simple (in contrast to compressive sensing in some instances). Also, as described herein, the SPB-decompression may easily be implemented in an enhancement NN. Thus, a computational cost for the decompression may not increase significantly.
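As an illustration of the analytical approach mentioned above, the following Python sketch approximates the decompression by regularized inverse filtering of the 2x2 box kernel that the SPB effectively applies to the pre-output image. The kernel placement, the regularization constant and the FFT-based formulation are assumptions of this sketch and are not the claimed decompression kernel 26 itself.

import numpy as np

def analytical_decompress(pre_output, k=2, eps=1e-3):
    """Recover an approximation of the native-resolution image.

    Deconvolves a normalized k x k box kernel from the pre-output image
    via a regularized inverse filter in the frequency domain; eps keeps
    near-zero frequencies of the kernel from amplifying noise. The kernel
    is anchored at the origin rather than centered, so the result carries
    a small spatial shift; this is acceptable for an illustration.
    """
    h, w = pre_output.shape
    kernel = np.zeros((h, w))
    kernel[:k, :k] = 1.0 / (k * k)           # normalized box (SPB) kernel
    K = np.fft.fft2(kernel)
    X = np.fft.fft2(pre_output)
    # Regularized inverse filter: conj(K) / (|K|^2 + eps)
    recovered = np.fft.ifft2(X * np.conj(K) / (np.abs(K) ** 2 + eps))
    return np.real(recovered)

In practice, the NN-based decompression described above may be preferred, since it can be trained jointly with the enhancement.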
Fig.19 illustrates an embodiment of a general-purpose computer 250. The general-purpose computer 250 can be implemented such that it can basically function as any type of electronic device (e.g., the electronic device 1 of Fig.1), for example, a camera, a smartphone, smart glasses, a head-mounted display, a smartwatch, a mobile phone, a mobile tablet, a notebook, a terminal device or the like. The general-purpose computer 250 is an example of an information processing apparatus that includes circuitry that is configured to perform the method according to the present technology (e.g., the method 10 of Fig.2). The computer has components 251 to 261, which can form a circuitry, such as the circuitry 2 of Fig.1 and/or any one of its units 3 to 6, or the like, as described herein. Embodiments which use software, firmware, programs or the like for performing the methods as described herein can be installed on the computer 250, which is then configured to be suitable for the concrete embodiment. The computer 250 has a CPU 251 (Central Processing Unit), which can execute various types of procedures and methods as described herein, for example, in accordance with programs stored in a read-only memory (ROM) 252, stored in a storage 257 and loaded into a random-access memory (RAM) 253, stored on a medium 260 which can be inserted in a respective drive 259, etc. Furthermore, the computer 250 includes an artificial intelligence (AI) processor 251a. The AI processor 251a may include a graphics processing unit (GPU) and/or a tensor processing unit (TPU). The AI processor 251a may be configured to execute an AI model (e.g., an artificial neural network), for example, the artificial neural network 18 of Fig.2. The CPU 251, the ROM 252 and the RAM 253 are connected with a bus 261, which in turn is connected to an input/output interface 254. The number of CPUs, memories and storages is only exemplary, and the skilled person will appreciate that the computer 250 can be adapted and configured accordingly for meeting specific requirements which arise when it functions as an information processing apparatus according to the present technology. At the input/output interface 254, several components are connected: an input 255, an output 256, the storage 257, a communication interface 258 and the drive 259, into which a medium 260 (compact disc (CD), digital video disc (DVD), universal serial bus (USB) flash drive, secure digital (SD) card, CompactFlash (CF) memory, or the like) can be inserted. The input 255 can be a pointer device (mouse, graphics tablet, or the like), a keyboard, a microphone, a camera, a touchscreen, an eye-tracking unit etc.
The output 256 can have a display (liquid crystal display (LCD), cathode ray tube (CRT) display, light-emitting diode (LED) display, electronic paper, etc.; e.g., included in a touchscreen), loudspeakers, etc. The storage 257 can have a hard disk drive (HDD), a solid-state drive (SSD), a flash drive and the like. The communication interface 258 can be adapted to communicate, for example, via universal serial bus (USB), MIPI, CSI, a serial port (RS-232), a parallel port (IEEE 1284), a local area network (LAN; e.g., Ethernet), wireless local area network (WLAN; e.g., Wi-Fi, IEEE 802.11), mobile telecommunications system (GSM, UMTS, LTE, NR, etc.), Bluetooth, near-field communication (NFC), ZigBee, infrared, etc. It should be noted that the description above only pertains to an example configuration of the computer 250. Alternative configurations may be implemented with additional or other sensors, storage devices, interfaces or the like. For example, the communication interface 258 may support other radio access technologies than the mentioned UMTS, LTE and NR. It should be recognized that the embodiments describe methods with an exemplary ordering of method steps. The specific ordering of method steps is however given for illustrative purposes only and should not be construed as binding. Changes of the ordering of method steps may be apparent to the skilled person. Please note that the division of the circuitry 2 into units 3 to 6 is only made for illustration purposes and that the present disclosure is not limited to any specific division of functions in specific units. For instance, the circuitry 2 could be implemented by a respective programmed processor, field programmable gate array (FPGA) and the like. The methods disclosed herein can also be implemented as a computer program causing a computer and/or a processor, such as the circuitry 2 discussed above, to perform the method, when being carried out on the computer and/or processor. In some embodiments, also a non-transitory computer-readable recording medium is provided that stores therein a computer program product, which, when executed by a processor, such as the processor described above, causes the method described to be performed. All units and entities described in this specification and claimed in the appended claims can, if not stated otherwise, be implemented as integrated circuit logic, for example on a chip, and functionality provided by such units and entities can, if not stated otherwise, be implemented by software.
In so far as the embodiments of the disclosure described above are implemented, at least in part, using software-controlled data processing apparatus, it will be appreciated that a computer program providing such software control and a transmission, storage or other medium by which such a computer program is provided are envisaged as aspects of the present disclosure. Note that the present technology can also be configured as described below. (1) Circuitry, configured to: obtain, from an image sensor, compressed image data generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associate, for at least some of the plurality of phases, a portion of an object represented by the compressed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data. (2) The circuitry of (1), wherein the deblurring of the image data includes transforming a position of the portion of the object of each phase of the plurality of phases such that the position of the portion of the object matches across the plurality of phases. (3) The circuitry of (1) or (2), wherein the circuitry is configured to input at least a part of the compressed image data to an artificial neural network and to execute the artificial neural network; wherein the artificial neural network is configured to: receive compressed image data as input; and perform the associating for phases represented by the received compressed image data. (4) The circuitry of (3), wherein the inputted compressed image data are associated with a predefined number of phases; and wherein the artificial neural network is further configured to perform the generation of the output image data based on the inputted compressed image data.
(5) The circuitry of (3) or (4), wherein the inputted compressed image data represent at least one of the plurality of phases; and wherein the artificial neural network is further configured to: receive, as input, previously determined output image data; and associate the portion of the object between the inputted compressed image data and the previously determined output image data. (6) The circuitry of (5), wherein the artificial neural network is further configured to weight the previously determined output image data differently than the received part of the compressed image data. (7) The circuitry of any one of (1) to (6), wherein the output image data include a frame whose number of pixels exceeds a number of photosensitive elements that have acquired the photon count. (8) The circuitry of any one of (1) to (7), wherein the shifted-pixel binning is based on: binning the photosensitive elements at a plurality of kernel sizes; and acquiring the photon count for the correspondingly binned photosensitive elements for each respective kernel size simultaneously. (9) The circuitry of any one of (1) to (8), wherein the circuitry is further configured to: determine a number of phases that is associated with a predefined image quality of the output image data; and cause the image sensor to acquire the photon count for the determined number of phases. (10) The circuitry of any one of (1) to (9), wherein the deblurring includes estimating a representation of the portion of the object in the output image data at a predefined point in time based on acquisition times associated with the plurality of phases. (11) The circuitry of (10), wherein the estimating includes: determining, based on the association, pixel values of the image data that correspond to the portion of the object, wherein the pixel values are based on the photon counts; and applying a decompression kernel to the determined pixel values, the decompression
kernel being configured to generate pixel values of the output image data based on a change of the pixel values across the plurality of phases.

(12) A method, comprising: obtaining, from an image sensor, compressed image data generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associating, for at least some of the plurality of phases, a portion of an object represented by the compressed image data; and generating output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data.

(13) The method of (12), wherein the deblurring of the image data includes transforming a position of the portion of the object of each phase of the plurality of phases such that the position of the portion of the object matches across the plurality of phases.

(14) The method of (12) or (13), wherein the method comprises inputting at least a part of the compressed image data to an artificial neural network and executing the artificial neural network; wherein the artificial neural network performs: receiving compressed image data as input; and performing the associating for phases represented by the received compressed image data.

(15) The method of (14), wherein the inputted compressed image data are associated with a predefined number of phases; and wherein the artificial neural network further performs the generation of the output image data based on the inputted compressed image data.

(16) The method of (14) or (15), wherein the inputted compressed image data represent at least one of the plurality of phases; and wherein the artificial neural network further performs: receiving, as input, previously determined output image data; and associating the portion of the object between the inputted compressed image data and the previously determined output image data.

(17) The method of (16), wherein the artificial neural network further weights the previously determined output image data differently than the received part of the compressed image data.

(18) The method of any one of (12) to (17), wherein the output image data include a frame whose number of pixels exceeds a number of photosensitive elements that have acquired the photon count.

(19) The method of any one of (12) to (18), wherein the shifted-pixel binning is based on: binning the photosensitive elements at a plurality of kernel sizes; and acquiring the photon count for the correspondingly binned photosensitive elements for each respective kernel size simultaneously.

(20) The method of any one of (12) to (19), wherein the method further comprises: determining a number of phases that is associated with a predefined image quality of the output image data; and causing the image sensor to acquire the photon count for the determined number of phases.

(21) The method of any one of (12) to (20), wherein the deblurring includes estimating a representation of the portion of the object in the output image data at a predefined point in time based on acquisition times associated with the plurality of phases.

(22) The method of (21), wherein the estimating includes: determining, based on the association, pixel values of the image data that correspond to the portion of the object, wherein the pixel values are based on the photon counts; and applying a decompression kernel to the determined pixel values, the decompression kernel generating pixel values of the output image data based on a change of the pixel values across the plurality of phases.

(23) A computer program comprising program code causing a computer to perform the method according to any one of (12) to (22), when carried out on a computer.

(24) A non-transitory computer-readable recording medium that stores therein a computer program product, which, when executed by a processor, causes the method according to any one of (12) to (22) to be performed.

(25) Circuitry, configured to: obtain, from an image sensor, processed image data generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associate, for at least some of the plurality of phases, a portion of an object represented by the processed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the processed image data.

(26) Circuitry, configured to: obtain, from an image sensor, compressed image data generated by the image sensor based on shifted-pixel binning; associate, for at least some of the plurality of shifted-pixel binning phases, a portion of an object represented by the compressed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of shifted-pixel binning phases, image data that are based on the compressed image data.
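The core of clauses (12) and (13) — aligning the object portion's position across the binning phases and merging the aligned phases — can be pictured with a short sketch. The NumPy code below is a minimal illustration under simplifying assumptions (integer pixel shifts, per-phase object positions already known); the names `deblur_by_alignment`, `phases` and `positions` are illustrative and do not appear in the application.

```python
import numpy as np

def deblur_by_alignment(phases: np.ndarray, positions: np.ndarray) -> np.ndarray:
    """Merge shifted-pixel-binning phases after aligning the object portion.

    phases:    (P, H, W) photon-count frames, one per binning phase.
    positions: (P, 2) row/column position of the object portion per phase.
    """
    reference = positions[0]
    aligned = np.empty(phases.shape, dtype=np.float64)
    for p in range(phases.shape[0]):
        # Transform the position of the object portion of this phase so that
        # it matches the position in the reference phase (cf. clause (13)).
        dy, dx = (reference - positions[p]).astype(int)
        aligned[p] = np.roll(phases[p], shift=(dy, dx), axis=(0, 1))
    # Averaging the aligned phases accumulates the photon counts of the
    # object portion without smearing them along its motion path.
    return aligned.mean(axis=0)
```

With sub-pixel motion, the integer `np.roll` would be replaced by an interpolating warp, but the align-then-merge structure stays the same.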
Claims
1. Circuitry, configured to: obtain, from an image sensor, compressed image data generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associate, for at least some of the plurality of phases, a portion of an object represented by the compressed image data; and generate output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data.
2. The circuitry of claim 1, wherein the deblurring of the image data includes transforming a position of the portion of the object of each phase of the plurality of phases such that the position of the portion of the object matches across the plurality of phases.
3. The circuitry of claim 1, wherein the circuitry is configured to input at least a part of the compressed image data to an artificial neural network and to execute the artificial neural network; wherein the artificial neural network is configured to: receive compressed image data as input; and perform the associating for phases represented by the received compressed image data.
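As one hedged illustration of claim 3, the sketch below shows how an artificial neural network might receive the compressed phase frames as input channels and regress a per-phase object offset, which is one way to realize the claimed associating. The topology and the name `PhaseAssociationNet` are assumptions made for this example (PyTorch is used); the claim does not fix any architecture.

```python
import torch
import torch.nn as nn

class PhaseAssociationNet(nn.Module):
    """Toy network: compressed phase frames in, per-phase object offsets out."""

    def __init__(self, num_phases: int):
        super().__init__()
        self.features = nn.Sequential(
            # Each binning phase enters as one input channel.
            nn.Conv2d(num_phases, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Regress one (dy, dx) offset per phase; matching these offsets
        # across phases is the "association" of the object portion.
        self.head = nn.Linear(32, 2 * num_phases)

    def forward(self, compressed: torch.Tensor) -> torch.Tensor:
        # compressed: (batch, num_phases, H, W) binned photon-count frames.
        x = self.features(compressed).flatten(1)
        return self.head(x).view(-1, compressed.shape[1], 2)
```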
4. The circuitry of claim 3, wherein the inputted compressed image data are associated with a predefined number of phases; and wherein the artificial neural network is further configured to perform the generation of the output image data based on the inputted compressed image data.
5. The circuitry of claim 3, wherein the inputted compressed image data represent at least one of the plurality of phases; and wherein the artificial neural network is further configured to:
receive, as input, previously determined output image data; and associate the portion of the object between the inputted compressed image data and the previously determined output image data.
6. The circuitry of claim 5, wherein the artificial neural network is further configured to weight the previously determined output image data differently than the received part of the compressed image data.
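Claims 5 and 6 describe a recurrent arrangement: previously determined output image data are fed back as input and weighted differently from the newly received compressed data. Below is a minimal sketch of one such fusion step with an assumed scalar feedback weight; in a learned system the weighting could itself be produced by the network, and `fuse_with_previous` is a hypothetical name.

```python
import numpy as np

def fuse_with_previous(aligned_phase: np.ndarray,
                       previous_output: np.ndarray,
                       feedback_weight: float = 0.8) -> np.ndarray:
    """Blend a newly aligned phase frame with the previous output image.

    The previously determined output is weighted differently (here: more
    heavily) than the freshly received compressed data, as in claim 6.
    """
    return (feedback_weight * previous_output
            + (1.0 - feedback_weight) * aligned_phase)
```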
7. The circuitry of claim 1, wherein the shifted-pixel binning is based on: binning the photosensitive elements at a plurality of kernel sizes; and acquiring the photon count for the correspondingly binned photosensitive elements for each respective kernel size simultaneously.
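The multi-kernel variant of claim 7 can be pictured as the same photon-count exposure reduced at several binning kernel sizes at once. The NumPy sketch below emulates this from a full-resolution count map; on an actual sensor the per-kernel sums would be formed in parallel readout paths, so this emulation is only illustrative, and `bin_at_kernel_sizes` is a name invented for the example.

```python
import numpy as np

def bin_at_kernel_sizes(counts: np.ndarray, kernel_sizes=(2, 4)) -> dict:
    """Bin one (H, W) photon-count map at several kernel sizes.

    H and W are assumed divisible by every kernel size.
    """
    binned = {}
    for k in kernel_sizes:
        h, w = counts.shape[0] // k, counts.shape[1] // k
        # Sum the photon counts over each k-by-k block of photosensitive
        # elements, giving one compressed frame per kernel size.
        binned[k] = counts.reshape(h, k, w, k).sum(axis=(1, 3))
    return binned
```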
8. The circuitry of claim 1, wherein the circuitry is further configured to: determine a number of phases that is associated with a predefined image quality of the output image data; and cause the image sensor to acquire the photon count for the determined number of phases.
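Claim 8 amounts to a quality-driven control loop over the number of phases. A plain-Python sketch follows, in which `acquire_phase`, `deblur` and `estimate_quality` are stand-ins for the sensor interface, the deblurring of claims 1 and 2, and an application-specific quality metric; all three names are assumptions, not taken from the application.

```python
def acquire_until_quality(acquire_phase, deblur, estimate_quality,
                          target_quality: float, max_phases: int = 64):
    """Acquire phases until the deblurred output reaches the target quality."""
    phases = []
    for n in range(1, max_phases + 1):
        phases.append(acquire_phase())   # photon count for one more phase
        output = deblur(phases)          # output image from the n phases so far
        if estimate_quality(output) >= target_quality:
            return output, n             # n is the determined number of phases
    return output, max_phases            # target not reached; best effort
```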
9. The circuitry of claim 1, wherein the deblurring includes estimating a representation of the portion of the object in the output image data at a predefined point in time based on acquisition times associated with the plurality of phases.
10. The circuitry of claim 9, wherein the estimating includes: determining, based on the association, pixel values of the image data that correspond to the portion of the object, wherein the pixel values are based on the photon counts; and applying a decompression kernel to the determined pixel values, the decompression kernel being configured to generate pixel values of the output image data based on a change of the pixel values across the plurality of phases.
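For claims 9 and 10, one plausible form of the decompression kernel is a per-pixel temporal fit: the pixel values that the association maps to the object portion are regressed over the phase acquisition times and evaluated at the predefined point in time. The linear least-squares choice below is an assumption; the claims only require that output pixel values be generated from the change of the pixel values across the phases.

```python
import numpy as np

def decompress_at_time(pixel_values: np.ndarray,
                       acquisition_times: np.ndarray,
                       query_time: float) -> np.ndarray:
    """Estimate object pixel values at a predefined point in time.

    pixel_values:      (P, N) values of N object pixels across P phases,
                       gathered via the association, based on photon counts.
    acquisition_times: (P,) acquisition time of each phase.
    """
    # Fit value(t) = slope * t + intercept per pixel; np.polyfit broadcasts
    # over the N columns, so slope and intercept each have shape (N,).
    slope, intercept = np.polyfit(acquisition_times, pixel_values, deg=1)
    # The fitted change of the pixel values across the phases is evaluated
    # at the predefined point in time.
    return slope * query_time + intercept
```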
11. A method, comprising: obtaining, from an image sensor, compressed image data generated by the image sensor based on shifted-pixel binning, wherein the shifted-pixel binning is based on: binning photosensitive elements of the image sensor for a plurality of phases, and acquiring in each of the plurality of phases a photon count for the correspondingly binned photosensitive elements; associating, for at least some of the plurality of phases, a portion of an object represented by the compressed image data; and generating output image data by deblurring, based on the association of the portion of the object for at least some of the plurality of phases, image data that are based on the compressed image data.
12. The method of claim 11, wherein the deblurring of the image data includes transforming a position of the portion of the object of each phase of the plurality of phases such that the position of the portion of the object matches across the plurality of phases.
13. The method of claim 11, wherein the method comprises inputting at least a part of the compressed image data to an artificial neural network and executing the artificial neural network; wherein the artificial neural network performs: receiving compressed image data as input; and performing the associating for phases represented by the received compressed image data.
14. The method of claim 13, wherein the inputted compressed image data are associated with a predefined number of phases; and wherein the artificial neural network further performs the generation of the output image data based on the inputted compressed image data.
15. The method of claim 13, wherein the inputted compressed image data represent at least one of the plurality of phases; and wherein the artificial neural network further performs: receiving, as input, previously determined output image data; and associating the portion of the object between the inputted compressed image data and the previously determined output image data.
16. The method of claim 15, wherein the artificial neural network further weights the previously determined output image data differently than the received part of the compressed image data.
17. The method of claim 11, wherein the shifted-pixel binning is based on: binning the photosensitive elements at a plurality of kernel sizes; and acquiring the photon count for the correspondingly binned photosensitive elements for each respective kernel size simultaneously.
18. The method of claim 11, wherein the method further comprises: determining a number of phases that is associated with a predefined image quality of the output image data; and causing the image sensor to acquire the photon count for the determined number of phases.
19. The method of claim 11, wherein the deblurring includes estimating a representation of the portion of the object in the output image data at a predefined point in time based on acquisition times associated with the plurality of phases.
20. The method of claim 19, wherein the estimating includes: determining, based on the association, pixel values of the image data that correspond to the portion of the object, wherein the pixel values are based on the photon counts; and applying a decompression kernel to the determined pixel values, the decompression kernel generating pixel values of the output image data based on a change of the pixel values across the plurality of phases.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP24166250.1 | 2024-03-26 | | |
| EP24166250 | 2024-03-26 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025201947A1 (en) | 2025-10-02 |
Family
ID=90482090
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/EP2025/057234 (WO2025201947A1, pending) | Circuitry and method | 2024-03-26 | 2025-03-17 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025201947A1 (en) |
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20170358103A1 (en) * | 2016-06-09 | 2017-12-14 | California Institute Of Technology | Systems and Methods for Tracking Moving Objects |
| US11711628B2 (en) * | 2021-05-28 | 2023-07-25 | Microsoft Technology Licensing, Llc | Systems and methods for obtaining color imagery using single photon avalanche diodes |
| WO2023148148A1 (en) * | 2022-02-01 | 2023-08-10 | Sony Semiconductor Solutions Corporation | Imaging device and method |
Non-Patent Citations (3)
| Title |
|---|
| CHO JIHYUN ET AL: "A 3-D Camera With Adaptable Background Light Suppression Using Pixel-Binning and Super-Resolution", IEEE JOURNAL OF SOLID-STATE CIRCUITS, IEEE, USA, vol. 49, no. 10, 1 October 2014 (2014-10-01), pages 2319 - 2332, XP011559752, ISSN: 0018-9200, [retrieved on 20140922], DOI: 10.1109/JSSC.2014.2340377 * |
| IWABUCHI KIYOTAKA ET AL: "Image Quality Improvements Based on Motion-Based Deblurring for Single-Photon Imaging", IEEE ACCESS, IEEE, USA, vol. 9, 12 February 2021 (2021-02-12), pages 30080 - 30094, XP011839663, [retrieved on 20210222], DOI: 10.1109/ACCESS.2021.3059293 * |
| MA SIZHUO ET AL: "Quanta burst photography", ACM TRANSACTIONS ON GRAPHICS, ACM, NY, US, vol. 39, no. 4, 8 July 2020 (2020-07-08), pages 79:1 - 79:16, XP059526046, ISSN: 0730-0301, DOI: 10.1145/3386569.3392470 * |
Similar Documents
| Publication | Title |
|---|---|
| US12182976B2 (en) | Image processing method, smart device, and computer readable storage medium |
| Bao et al. | Memc-net: Motion estimation and motion compensation driven neural network for video interpolation and enhancement |
| US12008797B2 (en) | Image segmentation method and image processing apparatus |
| Pearl et al. | Nan: Noise-aware nerfs for burst-denoising |
| US12148123B2 (en) | Multi-stage multi-reference bootstrapping for video super-resolution |
| US10579908B2 (en) | Machine-learning based technique for fast image enhancement |
| CN102388402B (en) | Image processing apparatus and image processing method |
| US8379120B2 (en) | Image deblurring using a combined differential image |
| CN106920221B (en) | Exposure fusion method taking into account luminance distribution and detail presentation |
| KR101664123B1 (en) | Apparatus and method of creating a high dynamic range image free of ghost artifacts by using filtering |
| US9280811B2 (en) | Multi-scale large radius edge-preserving low-pass filtering |
| Vitoria et al. | Event-based image deblurring with dynamic motion awareness |
| CN114627034A (en) | Image enhancement method, training method of image enhancement model and related equipment |
| CN113379609A (en) | Image processing method, storage medium and terminal equipment |
| Kim et al. | Joint demosaicing and deghosting of time-varying exposures for single-shot hdr imaging |
| CN116205822B (en) | Image processing method, electronic device and computer readable storage medium |
| CN115471417B (en) | Image noise reduction processing method, device, equipment, storage medium and program product |
| Shivaraju et al. | A new parallel DSP hardware compatible algorithm for noise reduction and contrast enhancement in video sequence using Zynq-7020 |
| WO2025201947A1 (en) | Circuitry and method |
| CN114529775A (en) | Model training method and device, computer equipment and storage medium |
| CN117768774A (en) | Image processor, image processing method, photographing device and electronic device |
| Maksymiv et al. | Methods of video quality-improving |
| CN117710210A (en) | Method and apparatus for super resolution |
| CN115187488A (en) | Image processing method and device, electronic device and storage medium |
| Wang et al. | Extreme low-light imaging with multi-granulation cooperative networks |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 25711541; Country of ref document: EP; Kind code of ref document: A1 |